SOME DISTRIBUTIONS OF SAMPLE MEANS

George W. Brown and John W. Tukey 
RCA Laboratories and Princeton University 

1. Summary. It is shown that certain monomials in normally distributed
quantities have stable distributions with index $2^{-k}$. This provides, for $k \ge 1$,
simple examples where the mean of a sample has a distribution equivalent to
that of a fixed, arbitrarily large multiple of a single observation. These examples
include distributions symmetrical about zero, and positive distributions.

Using these examples, it is shown that any distribution with a very long tail
(of average order $\ge x^{-3/2}$) has the distributions of its sample means grow flatter
and flatter as the sample size increases. Thus the sample mean provides less
information than a single value. Stronger results are proved for still longer
tails.

2. Introduction. This paper derives and exploits certain elementary
expressions for stable distributions. The practicing statistician may be
interested in the general discussion of results, going as far as Section 5. The reader
interested in probability theory may be interested in

(i) the simple monomials in normally distributed quantities which are 
shown to be stable (Section 7) 

(ii) the resulting bounds on the densities of these stable distributions 
(Section 8) 

(iii) Theorem A, which forms a partial converse to the Central Limit 
Theorem. 

It should be pointed out that examples of stable chance quantities arising from
infinite series (Khintchine 1937, [2], [3]) and integrals (Lévy 1935, [4]) are already
known. These results form a natural part of broader investigations into

(i) the relative value of the mean, the median, and their competitors 

(ii) the properties and distributions of simple functions of normally
distributed quantities.

3. Stable distributions. One of the typical properties of the normal
distribution with zero mean is that the distribution of the mean of a sample of $n$
has the same shape but is compressed by the factor $\sqrt{n}$. The Cauchy
distribution is well known for the property that the mean of a sample of $n$ has
the same distribution as a single observation.

Statisticians have not widely appreciated the fact that there are symmetric,
smooth distributions for every positive $\lambda < 2$, with the property that the
distribution of the mean of a sample of $n$ has the same shape as the original
distribution but is spread out in the ratio $n^{(1-\lambda)/\lambda}$. These are the symmetric stable
distributions of index $\lambda$.

It is interesting to note that if $\lambda = .001$, then the mean of a sample of two
is $2^{999}$ times as variable as the mean of a sample of one. For small $\lambda$ the means
become unduly variable with a rapidity which is difficult to comprehend.
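
The scaling rule of this section is easy to tabulate. The following is a minimal Python sketch, not part of the original paper: the function names are illustrative, and a symmetric stable parent of index $\lambda$ is assumed. It evaluates the factor $n^{(1-\lambda)/\lambda}$ by which the mean of a sample of $n$ is spread out relative to a single observation.

```python
import math

def mean_spread_factor(n, lam):
    """Factor n**((1 - lam) / lam) by which the mean of n observations is
    spread out relative to one observation, for a stable law of index lam."""
    if not 0 < lam <= 2:
        raise ValueError("the index must satisfy 0 < lam <= 2")
    return n ** ((1.0 - lam) / lam)

def log10_mean_spread_factor(n, lam):
    """Base-10 logarithm of the same factor; convenient for very small lam."""
    return (1.0 - lam) / lam * math.log10(n)

print(mean_spread_factor(4, 2.0))          # normal case: compression by 1/sqrt(n)
print(mean_spread_factor(4, 1.0))          # Cauchy case: no change
print(mean_spread_factor(4, 0.5))          # index 1/2: spread out by the factor n
print(log10_mean_spread_factor(2, 0.001))  # order of magnitude for lam = .001, n = 2
```

For $\lambda = .001$ and $n = 2$ the logarithm is about 300, which is the sense in which the variability grows "with a rapidity which is difficult to comprehend."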

4. Outline of results. Section 7 is devoted to the proof that certain
monomials in normal variables are stable of index $2^{-k}$ for integral $k$. Both
symmetrical and positive cases are shown to exist. For $k = 0$, the symmetrical case is
the familiar Cauchy distribution, which is the distribution of Student's “t”
on one degree of freedom, while the positive case for $k = 1$ is the distribution
of Snedecor's “F” on $\infty$ and 1 degrees of freedom.

In Section 8 it is shown that the symmetrical stable distribution of index $\lambda$
has a density which is

(i) bounded by a constant

(ii) bounded by a constant times $|x|^{-(1+\lambda)}$ for the values $\lambda = 1, \tfrac12, \tfrac14,
\cdots$, for which elementary examples are available. It is conjectured that
this is true for all $\lambda < 2$.

In Section 9 it is shown that, if a distribution has one long tail in the sense that

(1.1)    $\liminf_{x \to \infty} |x|^{1+\lambda}\, P\{x < X < x + h\} > 0,$

for some $h$ and one of the above values of $\lambda$ (the lim may be taken either as
$x \to +\infty$ or as $x \to -\infty$), then the distribution of the sum of a sample of $n$
spreads out as fast as for a stable distribution with the same value of $\lambda$. This
may be restated for the mean as follows:

(i) A distribution has a long tail of order $|x|^{-(1+\lambda)}$ if (1.1) holds for some
$h > 0$ and choice of sign for $x$.

(ii) If the distribution has a density $f(x)$, then (1.1) is a consequence of

(1.2)    $f(x) > \dfrac{A}{|x|^{1+\lambda}}, \qquad A > 0.$

(iii) The distribution of the mean of a sample of $n$ will be said to spread out
as fast as $n^k$, if the distance between any two percentage points for the mean of a
sample of $n$ is ultimately larger than a fixed multiple of $n^k$.

(iv) Theorem A. If the distribution of $X$ has at least one long tail of order
$|x|^{-(1+\lambda)}$, where $\lambda = 1, \tfrac12, \tfrac14, \cdots$, then the distribution of the mean of a sample
of $n$ values of $X$ spreads out as fast as $n^{(1-\lambda)/\lambda}$.

Section 10 presents a simple example of a distribution symmetric about zero
with such long tails that

(i) the distribution of the sample mean spreads out faster than any power
of $n$,

(ii) the median of a sample of any size fails to have finite moments of
positive order, integral or fractional.

5. Consequences for applied statistics. The basic consequences of these 
results for applied statistics can be summarized in the following statements. 

(a) The positions that the Cauchy distribution is an isolated case, or else
an extreme example of pathology, are now untenable.

(b) The use of the mean of a sample as a measure of location (or, when
dealing with positive distributions fixed at zero, as a measure of scale)
implies a belief that the tails of the underlying distribution are not too long.

(c) It is probable that the relative efficiencies of mean and median are 
greatly affected by the length of the tail. 

The importance of this last statement lies in the fact that direct empirical
evidence about tail length is very hard to obtain. The mean is well known 
to be more efficient when the underlying distribution is normal. Normality of 
the tails of practical distributions is rarely based on firm empirical evidence. 
In these practical cases, greater efficiency of the mean should often not be 
assumed without empirical confirmation. 

It may be argued that the results of this paper apply to the limit as $n \to \infty$
and to the behavior of the distribution near infinity, while the practical problems
involve moderate values of $n$ and the behavior of the distribution near its 5%,
1%, 0.1%, 95%, 99%, and 99.9% points. This is undoubtedly true, but the
authors believe, and have some evidence to confirm, the following
correspondence principle:

If certain mathematical tails imply certain asymptotic behavior, then 
similar practical tails imply similar behavior in moderate samples. 

Here “mathematical tails” refers to behavior at infinity while practical tails 
run from the 5% to the 0.1% point and from the 95% to the 99.9% point. 

It is of some interest to point out that Snedecor's “F” provides applications
of Theorem A. If $N$ values of $F$ are averaged, where each was obtained on $n_1$
and $n_2$ degrees of freedom, then as $N$ increases

(i) if $n_2 > 2$, the average converges to 1 (i.e. all percent points converge
to 1), by the Central Limit Theorem

(ii) if $n_2 = 2$, the percent points of the average stay a finite distance away
from each other, by Theorem A

(iii) if $n_2 = 1$, the percent points of the average separate from each
other at least as fast as a constant times $\sqrt{N}$, by Theorem A.

The consequences of Theorem A follow from the asymptotic density of $F$,
which is a constant times $F^{-1-\frac12 n_2}$.

6. Notation and terminology. Chance quantities (random variables) 
will be denoted by capitals and their values by lower case letters. The same 
letter will generally be used, so that x will frequently be a value of X. 

The letter $S$, with or without indices, represents a standard deviate
(normally distributed quantity with zero mean and unit variance). Unless
otherwise specified all sets of chance quantities will be assumed to be independent.

Cumulative distribution functions will be referred to simply as “cumulatives”
and will be denoted by capitals. Probability density functions will be referred
to as “densities” and will be denoted by the corresponding lower case letters.

The convolution of two cumulatives $F$ and $G$ will be denoted by $F*G$. It is
the cumulative of sums of two independent values, one from each distribution.

7. Special stable distributions. Cauchy (1853, [1]) recognized that
distributions with characteristic functions of the form

$$e^{-a|t|^{\lambda}}$$

were stable. A distribution is stable if whenever $k$ and $l$ are positive and $A$
and $B$ are independent chance quantities distributed according to the same
law, then $kA + lB$ is distributed like a fixed multiple of $A$. It is known (Lévy
1937, [5], pp. 94 ff.) that any stable distribution has a characteristic function
of the form

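$$e^{-(a + i\beta\, t/|t|)\,|t|^{\lambda}},$$

where $0 < \lambda \le 2$, $a > 0$, and $|\beta| \le |a \tan \tfrac12 \pi\lambda|$. Each stable distribution
thus has an index $\lambda$ such that $kA + lB$ and $(k^{\lambda} + l^{\lambda})^{1/\lambda} A$ have the same
distribution when $A$ and $B$ are a sample of two from the given distribution.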
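
As an illustration of the index rule, the following Python sketch computes the multiple $c$ for which $kA + lB$ is distributed like $cA$ when the index is $\lambda$, namely $c = (k^{\lambda} + l^{\lambda})^{1/\lambda}$. The function names are illustrative and not from the paper; the code only evaluates the scale rule and does not sample from any distribution.

```python
def combined_scale(k, l, lam):
    """Scale c such that k*A + l*B is distributed like c*A for index lam."""
    if k <= 0 or l <= 0:
        raise ValueError("k and l must be positive")
    return (k ** lam + l ** lam) ** (1.0 / lam)

def sum_scale(n, lam):
    """Scale of the sum of n independent copies of A: n**(1/lam)."""
    return n ** (1.0 / lam)

print(combined_scale(3, 4, 2.0))   # normal case: scales add in quadrature
print(combined_scale(1, 1, 1.0))   # Cauchy case: scales add linearly
print(combined_scale(1, 1, 0.5))   # index 1/2: the sum of two is like 4*A
```

Applying `combined_scale` repeatedly with unit weights reproduces `sum_scale`, which is why the sum of $n$ values behaves like $n^{1/\lambda}$ times a single value.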

This section exhibits, for every integral $k$, simple monomials of standard
deviates which have stable distributions of index $2^{-k}$.

(7.1) Theorem: Let $S, S_0, S_1, S_2, \cdots$ be a sequence of independent standard
deviates. Then

(i) $C_0 = S/S_0$ and $P_0 = 1$
are stable of index $1 = 2^{-0}$.

(ii) $C_1 = S/(S_0 S_1^2) = C_0/S_1^2$ and $P_1 = 1/S_1^2 = P_0/S_1^2$
are stable of index $\tfrac12 = 2^{-1}$.

(iii) $C_2 = S/(S_0 S_1^2 S_2^4) = C_1/S_2^4$
and $P_2 = 1/(S_1^2 S_2^4) = P_1/S_2^4$
are stable of index $\tfrac14 = 2^{-2}$.

(iv) in general, $C_k = C_{k-1}/S_k^{2^k}$ and $P_k = P_{k-1}/S_k^{2^k}$
are stable of index $2^{-k}$.

The $C_k$ are a sequence of symmetrically distributed chance quantities which
are here presented as monomials in normally distributed chance quantities and
whose stability properties imply for $k \ge 1$ that the distributions of means of
samples spread out as the sample size increases. The $P_k$ are a similar sequence,
all of whose values are positive.
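
The monomials of Theorem (7.1) are straightforward to simulate. The sketch below is an illustration under stated assumptions, not the authors' procedure: the helper names, the seed and the number of trials are arbitrary choices, and only the Python standard library is used. It draws $P_k$ and $C_k$ from independent standard deviates and compares the median of means of $n$ values of $P_k$ with the median of single values. For $k = 1$ (index $\tfrac12$) the ratio should be close to $n^{(1-\lambda)/\lambda} = n$.

```python
import random
import statistics

def sample_P(k, rng):
    """One value of P_k = 1 / (S_1**2 * S_2**4 * ... * S_k**(2**k)); P_0 = 1."""
    value = 1.0
    for j in range(1, k + 1):
        s = rng.gauss(0.0, 1.0)
        value /= abs(s) ** (2 ** j)    # even power, so the sign of s is irrelevant
    return value

def sample_C(k, rng):
    """One value of C_k = S / (S_0 * S_1**2 * ... * S_k**(2**k)) = (S / S_0) * P_k."""
    s = rng.gauss(0.0, 1.0)
    s0 = rng.gauss(0.0, 1.0)
    return (s / s0) * sample_P(k, rng)

def median_ratio(k, n, trials, seed=0):
    """Median of means of n values of P_k divided by the median of single values."""
    rng = random.Random(seed)
    singles = [sample_P(k, rng) for _ in range(trials)]
    means = [sum(sample_P(k, rng) for _ in range(n)) / n for _ in range(trials)]
    return statistics.median(means) / statistics.median(singles)

print(median_ratio(1, 2, 20000))   # should be near 2 for index 1/2 and n = 2
```

Medians are used because these distributions have no finite mean; any other fixed percentage point would serve equally well for the comparison.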

The stability properties of the $C_k$ follow directly, by means of elementary
composition properties of characteristic functions, from

(7.2) Lemma: The characteristic function of $C_k$ is

$$E(e^{itC_k}) = \exp(-2\,|\tfrac12 t|^{2^{-k}}).$$

Proof: The case $k = 0$ is the familiar Cauchy distribution. Denoting the
normal cumulative by $N(s)$, it is seen that

$$E(e^{itC_0}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp(its/s_0)\, dN(s)\, dN(s_0)
= \int_{-\infty}^{\infty} \exp(-\tfrac12 t^2/s_0^2)\, dN(s_0) = \exp(-|t|).$$

The second definite integral is well known (e.g. Formula 495 in B. O. Peirce's
table). Assuming the result for $k-1$, write

$$E(e^{itC_k}) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp(itC_{k-1}/s_k^{2^k})\, dF_{k-1}(C_{k-1})\, dN(s_k)
= \int_{-\infty}^{\infty} \exp(-2\,|\tfrac12 t|^{2^{-k+1}}/s_k^2)\, dN(s_k)
= \exp(-2\,|\tfrac12 t|^{2^{-k}}),$$

precisely as in the derivation for $k = 0$.

The stability properties of the $P_k$ follow, by completely analogous use of
the moment generating function, from

(7.3) Lemma: The moment generating function of $P_k$ is

$$E(e^{-tP_k}) = \exp(-2(\tfrac12 t)^{2^{-k}}), \qquad t \ge 0.$$

Proof: The trivial case $k = 0$ is verified directly, since $P_0 = 1$. The induction
from $k-1$ to $k$ is identical with the derivation of (7.2), as is seen by writing

$$E(e^{-tP_k}) = \int_{-\infty}^{\infty}\int_{0}^{\infty} \exp(-tP_{k-1}/s_k^{2^k})\, dG_{k-1}(P_{k-1})\, dN(s_k)
= \int_{-\infty}^{\infty} \exp(-2(\tfrac12 t)^{2^{-k+1}}/s_k^2)\, dN(s_k)
= \exp(-2(\tfrac12 t)^{2^{-k}}).$$
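
The induction step in both lemmas rests on a single definite integral, which in the present notation reads $E\{\exp(-a/S^2)\} = \exp(-\sqrt{2a})$ for a standard deviate $S$ and $a \ge 0$. The following Python sketch checks this numerically with a plain midpoint rule. The step count, the cutoff at 12 and the function names are assumptions made for the illustration.

```python
import math

def expect_exp_neg_a_over_s2(a, upper=12.0, steps=24000):
    """Midpoint-rule value of E[exp(-a / S**2)] for a standard normal S."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h                       # midpoints avoid s = 0
        density = math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)
        total += math.exp(-a / (s * s)) * density
    return 2.0 * total * h                      # the integrand is even in s

def mgf_P(k, t):
    """Value exp(-2 * (t/2)**(2**-k)) stated for E[exp(-t * P_k)] in lemma (7.3)."""
    return math.exp(-2.0 * (0.5 * t) ** (2.0 ** (-k)))

# Induction step from k - 1 to k: integrate exp(-a / s**2) with
# a = 2 * (t/2)**(2**(1 - k)) and compare with the stated result for k.
t, k = 1.5, 2
a = 2.0 * (0.5 * t) ** (2.0 ** (1 - k))
print(expect_exp_neg_a_over_s2(a), mgf_P(k, t))
```

The two printed numbers should agree to several decimal places, which is a check on the exponents $2^{-k+1}$ and $2^{-k}$ appearing in the proofs.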

In order to verify the stability properties, consider distributions with
characteristic functions of the form $\exp(-d\,|t|^{\lambda})$. If $A$ and $B$ are independently
distributed according to this distribution, then

$$E(e^{it(lA+mB)}) = E(e^{itlA})\,E(e^{itmB}) = e^{-d(l^{\lambda}+m^{\lambda})|t|^{\lambda}}$$

for $l, m > 0$. Parallel application of the moment generating function yields
precisely analogous results.
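
For reference, the step from this product formula to the scaling of sums and means can be written out as follows. This is a short worked derivation under the same assumption of a characteristic function $\exp(-d\,|t|^{\lambda})$; nothing beyond that assumption is used.

```latex
% lA + mB has the same characteristic function as a fixed multiple of A:
\[
  e^{-d\,(l^{\lambda}+m^{\lambda})\,|t|^{\lambda}}
  = \exp\!\Bigl(-d\,\bigl|(l^{\lambda}+m^{\lambda})^{1/\lambda}\,t\bigr|^{\lambda}\Bigr)
  = E\!\left(e^{\,it\,(l^{\lambda}+m^{\lambda})^{1/\lambda}A}\right).
\]
% Repeating the argument with unit weights for n independent values A_1, ..., A_n:
\[
  A_1 + \cdots + A_n \;\sim\; n^{1/\lambda}\,A ,
  \qquad
  \frac{A_1 + \cdots + A_n}{n} \;\sim\; n^{(1-\lambda)/\lambda}\,A .
\]
% For C_k, with \lambda = 2^{-k}, the mean of n values is distributed like n^{2^{k}-1} C_k.
```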

8. Some auxiliary results. It is the purpose of this section to establish 
some results concerning stable distributions. It will be convenient to state 
and prove some of these lemmas in general form. 

(8.1) Lemma: If $X$ has a density $f(x)$ satisfying

$$f(x) < A\,|x|^{-a},$$

then $X$ has finite negative moments of orders down to $-(1-a)$.

Proof: If $-(1-a) < \beta < 0$, then

$$|x|^{\beta} f(x) < A\,|x|^{-a+\beta},$$

with $-a+\beta > -1$. Now

$$\int_{-\infty}^{\infty} |x|^{\beta} f(x)\,dx < \int_{-\infty}^{-1} f(x)\,dx + \int_{-1}^{1} |x|^{\beta} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx
< \int_{-\infty}^{\infty} f(x)\,dx + A\int_{-1}^{1} |x|^{-a+\beta}\,dx < \infty,$$

which proves the lemma.

(8.2) Lemma: If $X$ has a density $f(x)$ satisfying

$$f(x) < A\,|x|^{-a}$$

and if $Y$ has a density $g(y)$ and a finite negative moment of order $-(1-a)$, then
the density $h(x)$ of $XY$ satisfies

$$h(x) < A_1\,|x|^{-a}.$$

Proof: The density $h(x)$ satisfies

$$h(x) = \int_{-\infty}^{\infty} \{f(x/t)\,g(t)/|t|\}\,dt
< \int_{-\infty}^{\infty} A\,|t|^{a}\,|x|^{-a}\,g(t)\,|t|^{-1}\,dt
= \Bigl\{\int_{-\infty}^{\infty} A\,|t|^{-(1-a)}\,g(t)\,dt\Bigr\}\,|x|^{-a} = A_1\,|x|^{-a}.$$

(8.3) Lemma: The density $h(y)$ of

$$Y_k = S_0\, S_1^2\, S_2^4 \cdots (S_k)^{2^k},$$

where $S_0, S_1, S_2, \cdots, S_k$ are independent standard deviates, satisfies

$$h(y) < A\,|y|^{-(1-2^{-k})}$$

and hence $Y_k$ has finite negative moments of all orders down to $-2^{-k}$.

Proof: Let $g_k(x)$ be the density of

$$X_k = (S_k)^{2^k};$$

then

$$g_k(x) = (2\pi)^{-1/2}\, 2^{-k} \exp(-\tfrac12 x^{2^{1-k}})\, x^{-1+2^{-k}},$$

whence

$$g_k(x) < A_1\,|x|^{-1+2^{-k}}.$$

For $k = 0$ this is the desired result; the other cases follow by induction, using
$Y_k = X_k Y_{k-1}$ and lemma (8.2). The final statement of the lemma then follows
from lemma (8.1).

(8.4) Theorem: For $\lambda = 2^{-k}$, the density $m_{\lambda}(x)$ of $C_k$ satisfies

(*)    $m_{\lambda}(x) < A\,|x|^{-(1+2^{-k})} = A\,|x|^{-(1+\lambda)},$

and also

(**)    $m_{\lambda}(x) < A_1.$

Proof: By definition, $C_k = S/Y_k$. By lemma (8.3) the density of $Y_k$ satisfies

$$h(y) < A_1\,|y|^{-(1-2^{-k})}.$$

The density of $1/Y_k$ satisfies

$$l_k(z) = \frac{1}{z^2}\, h_k(1/z) < |z|^{-2} A_1\,|z|^{1-2^{-k}} = A_1\,|z|^{-(1+2^{-k})}.$$

Since $S$ has a finite moment of order $2^{-k}$, it follows from lemma (8.2) that the
density of $S/Y_k$ satisfies the desired relation (*). Since $S$ has finite moments
of all positive orders, so does $S^{2^k}$ and therefore $Y_k$. Thus $1/Y_k$ has moments
of all negative orders, including $-1$. Since the density of $S$ is bounded, lemma
(8.2) implies the same for $S/Y_k$ and hence for $C_k$. This completes the proof
of the theorem.

9. Distributions with a long tail. The purpose of this section is to prove

(9.1) Theorem: If $D$ has a cumulative $F(x)$ such that for some $h > 0$, either

$$\liminf_{x \to +\infty} \frac{F(x+h) - F(x)}{|x|^{-(1+\lambda)}} > 0,$$

or

$$\liminf_{x \to -\infty} \frac{F(x+h) - F(x)}{|x|^{-(1+\lambda)}} > 0,$$

where $\lambda = 2^{-k}$ for $k = 0, 1, 2, \cdots$, and if $K_n(\alpha)$ is the $\alpha$-point ($100\alpha$ percent point)
of the distribution of sums of $n$ independent values of $D$, then

$$\liminf_{n \to \infty} \frac{K_n(\alpha_1) - K_n(\alpha_2)}{n^{1/\lambda}} > 0$$

whenever $\alpha_1 > \alpha_2$.

We begin with some lemmas. 

(9.2) Lemma: If

$$F(x) = \beta F'(x) + (1-\beta)F''(x), \qquad G(x) = \beta F'(x) + (1-\beta)\,1(x), \qquad 0 < \beta < 1,$$

where $F'(x)$ is a cumulative symmetric about zero and unimodal, $F''(x)$ is a
cumulative symmetric about zero, and $1(x)$ is the cumulative concentrated at zero (whence
$F(x)$ and $G(x)$ are cumulatives), and if $F_n(x)$ and $G_n(x)$ are the cumulatives of
sums of samples of $n$ from $F(x)$ and $G(x)$ respectively, then

$$F_n(x) \le G_n(x), \quad x > 0,$$

$$F_n(x) \ge G_n(x), \quad x < 0.$$

Proof: We begin with the case $n = 2$, where

$$F_2 = \beta^2 F'*F' + 2\beta(1-\beta)F'*F'' + (1-\beta)^2 F''*F''$$

and

$$G_2 = \beta^2 F'*F' + 2\beta(1-\beta)F' + (1-\beta)^2\, 1.$$

The lemma will have been proved for $n = 2$ if we can show that

$$F'*F''(x) \le F'(x), \quad x > 0,$$

$$F'*F''(x) \ge F'(x), \quad x < 0.$$

Now, if $x > 0$,

$$F'*F''(x) = \int_{-\infty}^{\infty} F'(x-s)\,dF''(s)
= \int_{0}^{\infty} [F'(x-s) + F'(x+s)]\,dF''(s)
\le 2\int_{0}^{\infty} F'(x)\,dF''(s) = F'(x),$$

where the first equality follows from the symmetry of $F'$, the inequality follows
from the unimodality of $F'$, and the last equality follows from the symmetry
of $F''$. The inequality is reversed if $x < 0$.

For general $n$,

$$F_n = \sum_{k=0}^{n} \binom{n}{k} \beta^k (1-\beta)^{n-k}\, F'_k * F''_{n-k},$$

where $F'_k$ (the convolution of $k$ copies of $F'$) is the cumulative for sums of $k$
independent values from $F'$, and $F''_k$ is similarly related to $F''$. Since $F'_k$ is
unimodal and symmetric and since $F''_{n-k}$ is symmetric, the same argument can
be applied term by term to complete the proof of the lemma. The requirement
that $F''$ be symmetric could be replaced by the formally weaker condition that
$F''_k(0) = \tfrac12$ for all $k$.

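(9.3) Lemma: If

$$F(x) = \beta F_{(\lambda)}(x) + (1-\beta)\,1(x), \qquad 0 < \beta < 1,$$

where $F_{(\lambda)}(x)$ is the cumulative of $C_k$, with $\lambda = 2^{-k}$, and if $K_n(\alpha)$ is as defined in
(9.1), then

$$\lim_{n \to \infty} n^{-1/\lambda} K_n(\alpha) = \beta^{1/\lambda} K_{(\lambda)}(\alpha),$$

where $K_{(\lambda)}(\alpha)$ is the $\alpha$-point for $F_{(\lambda)}(x)$.

Proof: Let $F_n$ and $F_{(\lambda)n}$ be the cumulatives of sums of $n$ from $F$ and $F_{(\lambda)}$
respectively, whence

$$F_{(\lambda)n}(x) = F_{(\lambda)}(n^{-1/\lambda}x).$$

Then

$$F_n(x) = \sum_{k=0}^{n} \binom{n}{k} \beta^k (1-\beta)^{n-k}\, F_{(\lambda)k}(x).$$

The characteristic function of $(n\beta)^{-1/\lambda}X$, where $X$ is the sum of a sample of $n$ from $F$, is

$$E(e^{it(n\beta)^{-1/\lambda}X}) = \sum_{k=0}^{n} \binom{n}{k} \beta^k (1-\beta)^{n-k} \exp(-d\,|(n\beta)^{-1/\lambda} k^{1/\lambda} t|^{\lambda})
= \sum_{k=0}^{n} \binom{n}{k} \beta^k (1-\beta)^{n-k} \exp\Bigl(-d\,\frac{k}{n\beta}\,|t|^{\lambda}\Bigr),$$

where the characteristic function associated with $F_{(\lambda)}(x)$ is $\exp(-d\,|t|^{\lambda})$. Thus
we have to deal with

$$E \exp\Bigl(-d\,\frac{k}{n\beta}\,|t|^{\lambda}\Bigr),$$

where $k$ has a binomial distribution with mean $n\beta$ and variance $n\beta(1-\beta)$,
so that $k/n\beta$ converges stochastically to unity. This implies that

$$\lim_{n \to \infty} E(e^{it(n\beta)^{-1/\lambda}X}) = e^{-d|t|^{\lambda}}$$

uniformly in every finite interval, whence $(n\beta)^{-1/\lambda}X$ converges stochastically
to $C_k$, which completes the proof of the lemma.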

(9.4) Lemma: If the symmetric cumulative $F(x)$ has a density $f(x)$, and if constants
$c_1$ and $c_2$ exist such that

$$f(x) > \min(c_1,\; c_2\,|x|^{-(1+\lambda)}),$$

where $\lambda = 1, \tfrac12, \tfrac14, \cdots$, then, if $\alpha \ne \tfrac12$,

$$\liminf_{n \to \infty} |n^{-1/\lambda} K_n(\alpha)| > 0.$$

Proof: According to theorem (8.4) there are constants $d_1$ and $d_2$ such that the
density of $C_k$ is bounded by $\min(d_1,\; d_2\,|x|^{-(1+\lambda)})$. Hence

$$\frac{F(x) - \beta F_{(\lambda)}(x)}{1 - \beta}$$

is monotone when $\beta = \min(c_1/d_1,\; c_2/d_2)$, and hence is a distribution function.
By lemma (9.2) the $\alpha$-points of $F$ lie outside those of $\beta F_{(\lambda)}(x) + (1-\beta)\,1(x)$,
and these, by lemma (9.3), increase at least as fast as $An^{1/\lambda}$.

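(9.5) Lemma: If the density of $D$ exists and equals $f(x)$, and if either

$$\liminf_{x \to +\infty} |x|^{1+\lambda} f(x) > 0,$$

or

$$\liminf_{x \to -\infty} |x|^{1+\lambda} f(x) > 0,$$

where $\lambda = 1, \tfrac12, \tfrac14, \tfrac18, \cdots$, then, for $\alpha_1 > \alpha_2$,

$$\liminf_{n \to \infty} n^{-1/\lambda}\{K_n(\alpha_1) - K_n(\alpha_2)\} > 0.$$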





Proof: Let $D_1$ and $D_2$ be independent with the distribution of $D$. Then
$D_1 - D_2$ has a symmetric density given by

$$g(x) = \int_{-\infty}^{\infty} f(x+s)\,f(s)\,ds.$$

If

$$\liminf_{x \to +\infty} |x|^{1+\lambda} f(x) > 0,$$

then for suitable $h$ and $\epsilon > 0$,

$$f(x) > \epsilon\,|x|^{-(1+\lambda)}, \quad \text{for all } x > h.$$

Therefore, for $x > 0$, writing $\gamma = -(1+\lambda)$,

$$g(x) > \int_{h}^{h+1} f(x+s)\,f(s)\,ds > \epsilon^2\,|h+1+x|^{\gamma}\,|h+1|^{\gamma} = b_1\,|b_2 + x|^{\gamma}.$$

Now

$$b_1\,|b_2 + x|^{\gamma} \ge \min\{b_1 2^{\gamma} b_2^{\gamma},\; b_1 2^{\gamma}\,|x|^{\gamma}\}$$

and hence, for $x > 0$ and suitable $c_1 > 0$, $c_2 > 0$,

$$g(x) > \min\{c_1,\; c_2\,|x|^{\gamma}\}.$$

Since $g(x)$ is symmetric, this is also true for $x < 0$. If

$$\liminf_{x \to -\infty} |x|^{1+\lambda} f(x) > 0,$$

then a similar argument proves the same result.

Let $K'_n(\alpha)$ be the $\alpha$-point for the sum of $n$ values of $D_1 - D_2$ and $K_n(\alpha)$ be
the $\alpha$-point for the sum of $n$ values of $D$. The most elementary relation
between these functions is

$$|K'_n(\tfrac12 \pm \tfrac12(\alpha_1 - \alpha_2)^2)| \le |K_n(\alpha_1) - K_n(\alpha_2)|.$$

To see this, observe that the sum of a sample of $n$ values of $D_1 - D_2$ is the
difference of the sums of two independent samples of $n$ values of $D$, and that
there is a probability of $(\alpha_1 - \alpha_2)^2$ that both of these sums will fall between
$K_n(\alpha_2)$ and $K_n(\alpha_1)$. Thus the intervals $(-|K_n(\alpha_1) - K_n(\alpha_2)|,\, 0)$ and
$(0,\, |K_n(\alpha_1) - K_n(\alpha_2)|)$ are each occupied by the difference with probability
$\ge \tfrac12(\alpha_1 - \alpha_2)^2$. Since $K'_n(\tfrac12) = 0$, the relation follows. Hence, if $\alpha_1 > \alpha_2$,

$$\liminf_{n \to \infty} n^{-1/\lambda}\{K_n(\alpha_1) - K_n(\alpha_2)\} \ge \liminf_{n \to \infty} n^{-1/\lambda}\,|K'_n(\tfrac12 \pm \tfrac12(\alpha_1 - \alpha_2)^2)|,$$

and by lemma (9.4) applied to the distribution of $D_1 - D_2$ this latter lim is
positive, which completes the proof of the lemma.

With the ground prepared, it is now possible to complete the
Proof of the theorem: Let $h$ be chosen so that

$$\liminf_{x \to \infty} |x|^{1+\lambda}\,(F(x+h) - F(x)) > 0.$$

This can always be done, if $X$ is replaced by $-X$ when necessary. Let $U$ have
the uniform distribution on the interval (0, 1) and consider the variable $D + hU$.
This variable has a density given by

$$g(x) = \frac{F(x) - F(x-h)}{h},$$

and, therefore,

$$\liminf_{x \to \infty} |x|^{1+\lambda} g(x) > 0.$$

Let $K_n(\alpha)$ be the $\alpha$-point for the sum of a sample of $n$ values of $D$, and let $K^*_n(\alpha)$
be the $\alpha$-point for the sum of a sample of $n$ values of $D + hU$. Since $|hU| \le h$,
it follows that

$$|K_n(\alpha) - K^*_n(\alpha)| \le nh.$$

Therefore, if $1/\lambda > 1$ and $\alpha_1 > \alpha_2$,

$$\liminf_{n \to \infty} n^{-1/\lambda}\{K_n(\alpha_1) - K_n(\alpha_2)\} = \liminf_{n \to \infty} n^{-1/\lambda}\{K^*_n(\alpha_1) - K^*_n(\alpha_2)\},$$

and by lemma (9.5) the latter lim is positive.

The case of $\lambda = 1$ requires a slightly more delicate argument. The sum of
a sample of n values of hU is asymptotically normally distributed, and hence 
it is less than Apt, for a suitable Ap , with probability /9. Therefore 

Kn(«fi < K*M) < ICM + A a n} 

and the same process yields the desired conclusion. 


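10. A distribution with very long tails. A somewhat pathological example
is provided by the symmetric cumulative

$$F(x) = \begin{cases} \dfrac{1}{\ln(e^2 + |x|)}, & x \le 0, \\[1ex] 1 - \dfrac{1}{\ln(e^2 + |x|)}, & x \ge 0, \end{cases}$$

which has the density

$$f(x) = \frac{1}{(e^2 + |x|)\,\{\ln(e^2 + |x|)\}^2}.$$

Since

$$\lim_{|x| \to \infty} |x|^{1+\lambda} f(x) = \infty \quad \text{for all } \lambda > 0,$$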


it follows from theorem (9.1) that the distribution of the sum of a sample of
$n$ values of $X$ spreads out faster than any power of $n$. The same must therefore
be true of the mean of a sample of $n$. There is clearly no use in taking any
kind of mean of such a sample.
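
The example can be explored numerically. The Python sketch below assumes the cumulative $1/\ln(e^2 + |x|)$ for $x \le 0$ and $1 - 1/\ln(e^2 + |x|)$ for $x \ge 0$, with density $1/[(e^2 + |x|)\{\ln(e^2 + |x|)\}^2]$; the function names are illustrative. It evaluates $|x|^{1+\lambda} f(x)$ at a few large values of $x$ to show the growth that the argument relies on.

```python
import math

E2 = math.e ** 2

def cumulative(x):
    """F(x) = 1/ln(e^2 + |x|) for x <= 0 and 1 - 1/ln(e^2 + |x|) for x >= 0."""
    tail = 1.0 / math.log(E2 + abs(x))
    return tail if x <= 0 else 1.0 - tail

def density(x):
    """f(x) = 1 / ((e^2 + |x|) * ln(e^2 + |x|)**2)."""
    u = E2 + abs(x)
    return 1.0 / (u * math.log(u) ** 2)

def tail_ratio(x, lam):
    """|x|**(1 + lam) * f(x); unbounded as |x| grows, for every lam > 0."""
    return abs(x) ** (1.0 + lam) * density(x)

for x in (1e3, 1e6, 1e9):
    print(x, tail_ratio(x, 0.5))
```

For small $\lambda$ the growth sets in only at very large $|x|$, since the ratio behaves like $|x|^{\lambda}/\{\ln |x|\}^2$, but it is unbounded for every $\lambda > 0$.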

There will, of course, be something to gain by taking the median of a sample
of $2n + 1$, since the distribution of the median always shrinks together as
$n \to \infty$, and whenever, as is true here, the density is finite and continuous
at the population median, the distributions of the sample medians shrink toward
the population median.

This does not prevent some pathology, however, since the cumulative for
the median of $2n + 1$ takes the form

$$\frac{(2n+1)!}{(n!)^2}\,\{F(x)\}^{n+1}\Bigl\{\frac{1}{n+1} + P(F(x))\Bigr\},$$

where $P(t)$ is a polynomial of degree $n$ with no constant term. Thus, for large
negative values of $x$, the cumulative for the median is asymptotically

$$\frac{(2n+1)!}{(n!)^2\,(n+1)} \cdot \frac{1}{\{\ln(e^2 + |x|)\}^{n+1}}$$

and the corresponding density is asymptotically

$$\frac{(2n+1)!}{(n!)^2} \cdot \frac{1}{(e^2 + |x|)\,\{\ln(e^2 + |x|)\}^{n+2}},$$

and it follows that the median has no moments of any positive order, integral
or fractional. This is true no matter how large the sample used!
UNBIASED ESTIMATES FOR CERTAIN BINOMIAL SAMPLING
PROBLEMS WITH APPLICATIONS¹

By M. A. Girshick, Frederick Mosteller, and L. J. Savage

U. S. Department of Agriculture; Statistical Research Group, Princeton
University; and Statistical Research Group, Columbia University

1. Introduction. The purpose of this paper is to present some theorems with
applications concerning unbiased estimation of the parameter $p$ (fraction
defective) for samples drawn from a binomial distribution. The estimate
constructed is applicable to samples whose items are drawn and classified one at a
time until the number of defectives $i$, and the number of nondefectives $j$,
simultaneously agree with one of a set of preassigned number pairs. When this
agreement takes place, the sampling operation ceases and an unbiased estimate
of the proportion $p$ of defectives in the population may be made. Some examples
of this kind of sampling are ordinary single sampling in which $n$ items are
observed and classified as defective or nondefective; curtailed single sampling where
it is desired to cease sampling as soon as the decision regarding the lot being
inspected can be made, that is as soon as the number of defectives or nondefectives
attains one of a fixed pair of preassigned values; double, multiple, and sequential
sampling. In the cases of double and multiple sampling the subsamples may
be curtailed when a decision is reached, while for sequential sampling the
process may be truncated, i.e. an upper bound may be set on the amount of sampling
to be done. In section 3 expressions are given for the unique unbiased
estimates of $p$ for single, curtailed single, curtailed double, and sequential sampling.

One or two of the illustrative examples of section 3 may be of interest because 
their rather bizarre results suggest that some estimate other than an unbiased 
estimate may be preferable; but the discussion of estimates other than unbiased 
ones is outside the scope of this paper. 

2. The estimate $\hat{p}$. For the purposes of the present paper the word point will
refer only to points in the $xy$-plane with nonnegative integral coordinates.

We shall need the following nomenclature. A region $R$ is a set of points
containing (0, 0). The point $(x_2, y_2)$ is immediately beyond $(x_1, y_1)$ if either
$x_2 = x_1 + 1$, $y_2 = y_1$ or $x_2 = x_1$, $y_2 = y_1 + 1$. A path in $R$ from the point $\alpha_0$ to the
point $\alpha_n$ is a finite sequence of points $\alpha_0, \alpha_1, \cdots, \alpha_n$ such that $\alpha_i$ ($i > 0$) is
immediately beyond $\alpha_{i-1}$, and $\alpha_i \in R$ with the possible exception of $\alpha_n$. A
boundary point, that is, an element of the boundary $B$ of $R$, is a point not in $R$
which is the last point $\alpha_n$ of a path from the origin. Accessible points are the
points in $R$ which can be reached by paths from the origin, while inaccessible
points are the points which cannot be reached by any path from the origin.

¹ This paper was originally written by Mosteller and Savage. A communication from
M. A. Girshick revealed that he had independently discovered for the sequential probability
ratio test the estimate $\hat{p}(\alpha)$ given here and demonstrated its uniqueness. For purposes of
publication it seemed appropriate to present the results in a single paper.



All points are thus divided into three mutually exclusive categories: accessible,
inaccessible, and boundary points. The index of a point is the sum of its
coordinates, and the index of a region is the least upper bound of the indices of its
accessible points. A finite region is a region for which the indices of the
accessible points are less than some number $n$. In particular a region containing
only a finite number of points is finite.

Paths may be thought of as arising by a random process such that a path
reaching $\alpha_i = (x, y)$, $\alpha_i \in R$, will be extended to $\alpha_{i+1} = (x, y + 1)$ with probability
$p$ or to $\alpha_{i+1} = (x + 1, y)$ with probability $q = 1 - p$. We exclude $p = 0, 1$
unless these values are specifically mentioned. When a path is extended to a
boundary point of $R$ the process ceases. It is clear from the definitions that for
a finite region $R$, paths from the origin cannot include more points than $n + 2$
where $n$ is the index of the region. This means that a path from the origin
cannot escape from a finite region and that the probability that it strikes some
boundary point is unity. It is clear that each path from the origin to a boundary
point or an accessible point has probability $p^y q^x$, if the point has coordinates
$(x, y)$. We will need the following statements which are immediate consequences
of the discussion above:

A. The probability of a boundary point or an accessible point being included in a
path from the origin is $P(\alpha) = k(\alpha)\,p^y q^x$ where $k(\alpha)$ is the number of paths from the
origin to the point. We shall call $P(\alpha)$ the probability of the point.

B. For a finite region $\sum_{\alpha \in B} P(\alpha) = 1$, i.e. the sum of the probabilities of the
boundary points is unity.

Any region for which $\sum_{\alpha \in B} P(\alpha) = 1$ will be called a closed region.

Of course, all finite regions are closed; but it is convenient to have a condition
such as that supplied by the following theorem guaranteeing the closure of some
infinite regions as well.

Theorem 1. A sufficient condition² that a region $R$ be closed is that
$\liminf_{n \to \infty} A(n)/\sqrt{n} = 0$, where $A(n)$ is the number of accessible points of index $n$.

Proof. We consider the ascending sequence of finite regions $R_n$, each
consisting of the points of $R$ whose indices are less than $n$. The boundary $B_n$ of
$R_n$ can be written as the set theoretic union $K_n \cup A_n$, where $K_n$ is $B_n \cap B$, and
$A_n$ are the accessible points of $R$ of index $n$. If $\alpha \in B_n$ and $P_n(\alpha)$ is the
probability of $\alpha$ with respect to $R_n$, it is easily seen that for $\alpha \in K_n$, $P_n(\alpha) = P(\alpha)$.
Since every point of $B$ is ultimately contained in the ascending sequence $K_n$,

$$\sum_{\alpha \in B} P(\alpha) = \lim_{n \to \infty} \sum_{\alpha \in K_n} P(\alpha) = \lim_{n \to \infty} \sum_{\alpha \in K_n} P_n(\alpha) \le 1,$$

the inequality being a consequence of statement B. But $\sum_{\alpha \in A_n} P_n(\alpha)$ is
monotonically decreasing because $\sum_{\alpha \in K_n} P_n(\alpha)$ is monotonically increasing with $n$
while $\sum_{\alpha \in B_n} P_n(\alpha) = 1$, from statement B.


If we can show $\lim_{n \to \infty} \sum_{\alpha \in A_n} P_n(\alpha) = 0$ under the condition of the theorem,
the proof is complete. For any point $\alpha \in A_n$, $P_n(\alpha) = k_n(\alpha)\,p^y q^{n-y}$ which for
fixed $p$ is $O(1/\sqrt{n})$. The sum over $A_n$ is $O(A(n)/\sqrt{n})$ and therefore since the
hypothesis of the theorem implies that $A(n)/\sqrt{n}$ attains arbitrarily small values
for arbitrarily large values of $n$, the sum in question decreases monotonically
to zero.

² If it is desired to admit $p = 0, 1$, the existence of boundary points $(x, 0)$ or $(0, y)$
respectively must be postulated.

Corollary. If the number of accessible points of R of index n is bounded, the 
region is closed. 

That the condition given in Theorem 1 is not a necessary condition may be 
seen by examining the region R consisting of all points except points of the form 
(2x + 1, 2y + 1) and (3, 0) and (0, 3). 

Theorem 2. If R is closed and R contains S, S is closed. 

Proof. The proof is essentially similar to that of Theorem 1.

Any reasonable estimate of $p$ will be a function defined on the boundary points,
because the boundary points constitute, so to speak, a sufficient statistic for $p$.
That is, the probability of any path from (0, 0) given the boundary point $\alpha$ at
which it terminates is independent of $p$, and is in fact $1/k(\alpha)$.

We shall construct an unbiased estimate of $p$ for closed regions $R$, that is a
function $\hat{p}(\alpha)$, $\alpha \in B$, such that $\sum_{\alpha \in B} \hat{p}(\alpha)P(\alpha) = p$ (absolutely convergent).³

Construction. Let $k^*(\alpha)$ be the number of paths in $R$ from the point (0, 1)
to the boundary point $\alpha$, and let $\hat{p}(\alpha) = k^*(\alpha)/k(\alpha)$. We remark that the
definitions imply $k^*((0, 1)) = 1$, when (0, 1) is a boundary point.
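
The construction can be carried out mechanically for any finite region. The following Python sketch is an illustration, not the authors' computation: the function names are hypothetical, and a region is represented by a membership function `in_region(x, y)` together with a bound `max_index` on the indices of its boundary points. It counts the paths $k(\alpha)$ from the origin and $k^*(\alpha)$ from (0, 1), forms the ratio $k^*(\alpha)/k(\alpha)$ at each boundary point, and evaluates the expectation of the estimate exactly with rational arithmetic.

```python
from fractions import Fraction

def path_counts(in_region, start, max_index):
    """Number of paths from `start` to each point reached, where every point
    of a path except possibly the last must lie in the region."""
    counts = {start: 1}
    layer = {start: 1}
    for _ in range(max_index - sum(start)):
        next_layer = {}
        for (x, y), ways in layer.items():
            if not in_region(x, y):
                continue                       # a path stops once it leaves the region
            for nxt in ((x + 1, y), (x, y + 1)):
                next_layer[nxt] = next_layer.get(nxt, 0) + ways
        counts.update(next_layer)
        layer = next_layer
    return counts

def unbiased_estimates(in_region, max_index):
    """Map each boundary point a to the ratio k*(a) / k(a)."""
    k = path_counts(in_region, (0, 0), max_index)
    k_star = path_counts(in_region, (0, 1), max_index)
    return {a: Fraction(k_star.get(a, 0), k[a]) for a in k if not in_region(*a)}

def expectation(in_region, max_index, p):
    """Sum over boundary points of estimate * k(a) * p**y * q**x."""
    k = path_counts(in_region, (0, 0), max_index)
    est = unbiased_estimates(in_region, max_index)
    q = 1 - p
    return sum(est[(x, y)] * k[(x, y)] * p ** y * q ** x for (x, y) in est)

def single_sampling(n):
    """Region for ordinary single sampling: continue while x + y < n."""
    return lambda x, y: x + y < n

print(unbiased_estimates(single_sampling(5), 5))
print(expectation(lambda x, y: y < 2 and x + y < 5, 5, Fraction(1, 3)))
```

For single sampling of $n$ items the ratio reduces to $y/n$, the usual fraction defective. The second call uses a curtailed plan that stops at two defectives or five items, and its exact expectation equals the value of $p$ supplied.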

Theorem 3. For any closed region $R$, $\hat{p}(\alpha)$ is an unbiased estimate of $p$.

Proof:

$$\sum_{\alpha \in B} \hat{p}(\alpha)P(\alpha) = \sum_{\alpha \in B} \frac{k^*(\alpha)}{k(\alpha)}\, k(\alpha)\, p^y q^x = \sum_{\alpha \in B} k^*(\alpha)\, p^y q^x.$$

If (0, 1) is a boundary point, then $k^*((0, 1)) = 1$ and $k^*(\alpha) = 0$, $\alpha \ne (0, 1)$, in
which case the sum in question consists of the single term $p$. If (0, 1) is not a
boundary point, consider the region $R'$ obtained by deleting (0, 1) from $R$, and
$k'(\alpha)$, the number of paths in $R'$ from the origin to the boundary point $\alpha$ of $R$.

$$k^*(\alpha) = k(\alpha) - k'(\alpha),$$

$$\sum_{\alpha \in B} k^*(\alpha)\, p^y q^x = \sum_{\alpha \in B} k(\alpha)\, p^y q^x - \sum_{\alpha \in B} k'(\alpha)\, p^y q^x = 1 - \sum_{\alpha \in B} k'(\alpha)\, p^y q^x.$$

Now $R'$ is closed (Theorem 2); except for (0, 1) every boundary point of $R'$ is
easily seen to be a boundary point of $R$; and $k'(\alpha)$ vanishes except for the
boundary points of $R'$. Therefore

$$p + \sum_{\alpha \in B} k'(\alpha)\, p^y q^x = 1,$$

and the proof is complete.

³ Even if such a sum were $p$ for a region which was not closed, we would not call the
estimate an unbiased estimate.

It is clear from the construction that $0 \le \hat{p}(\alpha) \le 1$; this is rather satisfying,
since an estimate of $p$ outside of these bounds would be received with some
misgivings.

Theorem 3 may be generalized to yield unbiased estimates of linear
combinations of functions of the form $p^t q^u$ provided the points $(u, t)$ are not inaccessible
points. We need only let the point $(u, t)$ play the role of (0, 1). Even though
the point $(u, t)$ is inaccessible it may be possible to represent $p^t q^u$ as a polynomial,
none of whose terms correspond to inaccessible points.

It is clear from Theorem 1 that $\hat{p}(\alpha)$ is an unbiased estimate of $p$ for the usual
sequential binomial tests, but the computation may be quite heavy. It should
be noted that the coordinate system used here differs slightly from the coordinate
system customarily used in sequential analysis. The custom is to let the $x$
coordinate represent the number of items inspected, whereas we use it to
represent the number of nondefectives; this is the only difference between the
coordinates. We understand that in applications the customary procedure seems
preferable, but we find the present coordinates more convenient for the purposes
of this article.

In general p̂ is not the only unbiased estimate of p. A necessary condition for uniqueness is that the region be simple, that is, that all the points between any two accessible points on the line x + y = n be accessible points. In other words, no accessible points of index n shall be separated on the line x + y = n by inaccessible points or boundary points.

Theorem 4. A necessary condition that the estimate p̂ be the unique unbiased estimate for the closed region R is that R be simple.

Proof. For a region that is not simple we shall construct a function m(α), not identically zero, such that

$$\sum_{\alpha \in B} m(\alpha) P(\alpha) = 0. \tag{1}$$

Then p̂(α) + m(α) will be an unbiased estimate of p different from p̂.

Suppose we have a closed region R which is not simple. We consider the lowest index n where the accessible points are separated. There will be at least one uninterrupted sequence of points between some pair of accessible points that are not accessible points. It is easy to see that all the points of this uninterrupted sequence are boundary points of R. Let this sequence be the points (x_0 − j, y_0 + j), j = 0, 1, ..., t; x_0 + y_0 = n. To begin the construction of m(α), let m(α) = (−1)^j/k(α) at the point (x_0 − j, y_0 + j) of this sequence. Let l'(α) be the number of paths from α' = (x_0 + 1, y_0) to the boundary point α, with the convention that l'(α') = 1 and l'(α) = 0 for all other α if α' is a boundary point, and let l''(α) be the number of paths from α'' = (x_0 − t, y_0 + t + 1) to the boundary point α with the same convention if α'' is a boundary point. To complete the construction of m(α), let m(α) = −[l'(α) + (−1)^t l''(α)]/k(α) for boundary points not members of the sequence under consideration. Before proceeding to check equation (1), we show that

$$\sum_{\alpha \in B} l'(\alpha) p^y q^x = p^{y_0} q^{x_0+1}; \qquad \sum_{\alpha \in B} l''(\alpha) p^y q^x = p^{y_0+t+1} q^{x_0-t}. \tag{2}$$

Because of symmetry we need only carry out the demonstration for the first sum. If α' is a boundary point, l'(α') = 1, and for all other points α, l'(α) = 0, and the sum is the single term p^{y_0} q^{x_0+1}. If α' is not a boundary point, consider the region R' obtained by deleting α' from R and the corresponding k'(α), the number of paths from (0, 0) to the boundary points of the new closed region R'. Every boundary point of R' except α' is a boundary point of R. Let us extend the definition of k'(α) to the whole boundary of R by defining k'(α) = 0 for α not in the boundary B' of R'. Then it is easy to see that

$$k(\alpha) = k'(\alpha')\, l'(\alpha) + k'(\alpha).$$

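Now

$$1 = \sum_{\alpha \in B} k(\alpha) p^y q^x = k'(\alpha') \sum_{\alpha \in B} l'(\alpha) p^y q^x + \sum_{\alpha \in B} k'(\alpha) p^y q^x = k'(\alpha') \sum_{\alpha \in B} l'(\alpha) p^y q^x + 1 - k'(\alpha') p^{y_0} q^{x_0+1},$$

establishing equation (2).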

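We now check that m(α) satisfies equation (1):

$$\sum_{\alpha \in B} m(\alpha) k(\alpha) p^y q^x = \sum_{j=0}^{t} (-1)^j p^{y_0+j} q^{x_0-j} - \sum_{\alpha \in B} l'(\alpha) p^y q^x - (-1)^t \sum_{\alpha \in B} l''(\alpha) p^y q^x$$

$$= \sum_{j=0}^{t} (-1)^j p^{y_0+j} q^{x_0-j} - p^{y_0} q^{x_0+1} - (-1)^t p^{y_0+t+1} q^{x_0-t}$$

$$= p^{y_0} q^{x_0+1} + (-1)^t p^{y_0+t+1} q^{x_0-t} - p^{y_0} q^{x_0+1} - (-1)^t p^{y_0+t+1} q^{x_0-t}$$

$$= 0.$$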


Theorem 5. A necessary condition that p̂(α) be a unique unbiased estimate of p for the closed region R is that there be no closed region R' whose boundary is a proper subset of the boundary of R.

Proof. Again supposing that the condition is not satisfied we shall construct a function m(α) not identically zero such that equation (1) is satisfied. Let k'(α) be the number of paths in R' to α in B of R, understanding, of course, that k'(α) = 0 if α is not in B' of R'. Consider m(α) = 1 − k'(α)/k(α); m(α) is not identically zero because k'(α) vanishes for at least one α, but k(α) does not. From the closure of R and R' it is obvious that m(α) satisfies equation (1).
Two simple examples will suffice to show that neither simplicity nor the condition of Theorem 5 is alone sufficient to insure the uniqueness of p̂. The region consisting of the points whose coordinates are given in the following configuration, and whose boundary points are indicated by the x's, satisfies the condition of Theorem 5 but is not simple:

      x
    (0, 3)     x
    (0, 2)     x
    (0, 1)   (1, 1)     x        x
    (0, 0)   (1, 0)   (2, 0)   (3, 0)     x

On the other hand the region consisting of all points for which y < 3, except for the two points (1, 0), (1, 1), is simple but does not satisfy the conditions of Theorem 5, because the region consisting of all points except (1, 0) with y < 3 can play the role of R'.

The authors are unable to decide whether the two conditions together guarantee the uniqueness of p̂ as an unbiased estimate of p, and supply the following sufficient condition which is adequate for many practical purposes.

Theorem 6. A sufficient condition that a closed region have p(a) a unique un- 
biased estimate of p is that the region be simple and that there exist g, h (0 < g,h 1) 
such that for all boundary points |gx − hy| < M.

Proof. If there were an unbiased estimate of p different from p̂, subtracting it from p̂ would yield an equation of the form (sum absolutely convergent):

$$\sum_{\alpha \in B} m(\alpha) p^y q^x = 0, \tag{3}$$

where m(α) is not identically zero. But this will be shown to be impossible. If m(α) were not identically zero, there would be an α_0 such that m(α_0) ≠ 0 and 1) m(α) = 0 for all boundary points of index less than that of α_0, and 2) one of the coordinates of α_0 is less than the corresponding coordinate of any other boundary point for which m(α) ≠ 0. This follows easily from the simplicity requirement, which implies that the boundary points of index n are broken into two sets: a) those whose y coordinates are less than the y coordinates of the accessible points of index n, and b) those whose x coordinates are less than the x coordinates of the accessible points of index n.* Since the situations a) and b) are symmetrical we suppose without loss of generality that α_0 is a boundary point whose y coordinate is less than that of any other boundary point with m(α) ≠ 0. Equation (3) may be written

$$m(\alpha_0) p^{y_0} q^{x_0} + p^{y_0+1} \sum_{\alpha \in B,\ \alpha \ne \alpha_0} m(\alpha) p^{y-y_0-1} q^x = 0, \tag{4}$$

* It will be seen as the proof proceeds that if there are no boundary points to which alternative a) applies, the restriction g > 0 may be removed and replaced by g ≥ 0; similarly if there are no boundary points to which b) applies the condition h > 0 may be replaced by h ≥ 0.

where the exponents appearing in the sum are nonnegative. But it will be shown that for sufficiently small p

$$q^{x_0} |m(\alpha_0)| > p \left| \sum_{\alpha \ne \alpha_0} m(\alpha) p^{y-y_0-1} q^x \right|, \tag{5}$$

which contradicts equation (4). Now

(6) | 2 m(.a) V v -^\ x | < 2 | m(a) \ V v ~^ 

< 2 | m(a) | o+A+w+ex-M/e 

= <f ulQ 2 | m(a) | (pq^y^- 1 

< q~ iHv o+ 1)+2 "'/^S | m(a) \ p v ~ vt ~ 1 q x , 

where all the summations range over the values indicated in (5) . The summa- 
tion indicated in (5) is thus seen to be dominated by a convergent power series 
in pq ' ! “ . 

Thus Theorem 6 shows that p̂ is a unique estimate for the sequential binomial tests.

Theorem 7. A necessary and sufficient condition that p̂ be the unique unbiased estimate of p for a closed finite region R is that R be simple.

Proof. The proof follows immediately from Theorems 4 and 6.

3. Applications and illustrative examples. 

A. Single sampling. In single sampling a random sample of n items is drawn from a lot containing items each of which is either defective or nondefective. It is customary to estimate p, the proportion defective, by the unbiased estimate i/n, where i is the number of defectives observed. The boundary of the region defined by a single sampling plan consists of all points of index n. Now

$$k((n-i, i)) = \binom{n}{i} \quad \text{and} \quad k^*((n-i, i)) = \binom{n-1}{i-1}.$$

Consequently the unique unbiased estimate of p is

$$\hat{p}((n-i, i)) = \binom{n-1}{i-1} \Big/ \binom{n}{i} = \frac{i}{n},$$

the result above.

It may be of interest to note that an unbiased estimate of the variance pq/n of p̂ is

$$\binom{n-2}{i-1} \Big/ \left[ n \binom{n}{i} \right] = \frac{i(n-i)}{n^2(n-1)}, \quad (n > 1);$$

this estimate is obtained by the method suggested immediately following Theorem 3.
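The two single-sampling estimates can be confirmed by direct summation over the binomial distribution. The sketch below is illustrative only; the function name `single_sampling_checks` is an assumption made for this example.

```python
from fractions import Fraction
from math import comb


def single_sampling_checks(n, p):
    """Return (E[p_hat], E[variance estimate]) for a sample of size n > 1.

    p_hat uses k* / k with k = C(n, i) and k* = C(n-1, i-1); the variance
    estimate uses paths from (1, 1), i.e. C(n-2, i-1) / (n * C(n, i)).
    """
    q = 1 - p
    mean = Fraction(0)
    mean_var_estimate = Fraction(0)
    for i in range(n + 1):                     # i = number of defectives
        prob = comb(n, i) * p**i * q**(n - i)
        k_star = comb(n - 1, i - 1) if i >= 1 else 0
        p_hat = Fraction(k_star, comb(n, i))
        assert p_hat == Fraction(i, n)         # the customary estimate i/n
        k_from_11 = comb(n - 2, i - 1) if 1 <= i <= n - 1 else 0
        var_hat = Fraction(k_from_11, n * comb(n, i))
        assert var_hat == Fraction(i * (n - i), n * n * (n - 1))
        mean += p_hat * prob
        mean_var_estimate += var_hat * prob
    return mean, mean_var_estimate


p = Fraction(1, 5)
mean, mean_var = single_sampling_checks(8, p)
assert mean == p and mean_var == p * (1 - p) / 8
```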

B. Curtailed single sampling. In single sampling schemes, there is usually given a rejection number c as well as the sample size n. If c or more defectives are found in the sample the lot is rejected, but if less than c defectives are found in the sample the lot is accepted. It is customary to inspect all the items in the sample even if the final decision to accept or reject the lot is known before the completion of the inspection of the sample. One reason sometimes mentioned for this procedure is that an unbiased estimate for p is not known when the inspection is halted as soon as a decision is reached. We provide the unbiased estimate in the following paragraph.

In curtailed single sampling the boundary points when rejecting are (x, c), c + x ≤ n; when accepting, (n − c + 1, y), y ≤ c − 1. The region is a rectangular array and obviously simple. The unique unbiased estimate along the horizontal line corresponding to rejection with c > 1 therefore is


$$\hat{p}((x, c)) = \binom{c+x-2}{c-2} \Big/ \binom{c+x-1}{c-1} = \frac{c-1}{c+x-1},$$


or in words, one less than the number of defectives observed divided by one less than the number of observations. The unique unbiased estimate along the vertical line corresponding to acceptance for c > 1 is

$$\hat{p}((n-c+1, y)) = \frac{y}{n-c+y},$$

that is, the number of defectives observed divided by one less than the number of observations. We reserved the case c = 1 because it is rather illuminating. The construction of Theorem 3 works as usual, and we note that p̂((0, 1)) = 1, p̂((n, 0)) = 0 as we might expect, but p̂((x, 1)) = 0, 0 < x < n.

It is somewhat startling to find that the only unbiased estimate of p for curtailed single sampling with c = 1 provides zero estimates unless a defective is observed on the first item. We remark that the variance of this estimate is pq. In other words, curtailed single sampling with c = 1 is no better for estimation purposes than a sample of size one when the unbiased estimate p̂ is used.
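The curtailed-sampling estimates, including the degenerate case c = 1, can be verified by summing over the boundary points. This is a minimal sketch; the function name `curtailed_moments` and the requirement n > c are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb


def curtailed_moments(n, c, p):
    """Total probability, mean and variance of p_hat for curtailed
    single sampling with sample size n > c and rejection number c."""
    q = 1 - p
    points = []                                  # (probability, estimate)
    for x in range(n - c + 1):                   # rejection points (x, c)
        prob = comb(x + c - 1, c - 1) * p**c * q**x
        if c > 1:
            est = Fraction(c - 1, c + x - 1)
        else:
            est = Fraction(1 if x == 0 else 0)   # the case c = 1
        points.append((prob, est))
    for y in range(c):                           # acceptance points (n-c+1, y)
        prob = comb(n - c + y, y) * p**y * q**(n - c + 1)
        points.append((prob, Fraction(y, n - c + y)))
    total = sum(prob for prob, _ in points)
    mean = sum(prob * est for prob, est in points)
    variance = sum(prob * est**2 for prob, est in points) - mean**2
    return total, mean, variance


p = Fraction(1, 4)
total, mean, _ = curtailed_moments(6, 3, p)
assert total == 1 and mean == p                  # closed region, unbiased
_, mean1, var1 = curtailed_moments(5, 1, p)
assert mean1 == p and var1 == p * (1 - p)        # variance pq when c = 1
```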

A limiting case of curtailed sampling when n is unbounded has been considered by Haldane⁵ as a useful technique in connection with estimates of the frequency of occurrence of rare events. The region would not be closed unless p = 0 were excluded. In our nomenclature there is a "rejection number" c (c > 1), and we continue sampling and inspecting until c defectives have been observed. The unbiased estimate⁶ is (c − 1)/(j − 1), where j is the total number of observations, and of course this is the estimate given by Haldane.

C. A general curtailed double sampling plan. The following example will illustrate the sort of calculations involved in computing p̂ for multiple and sequential plans. A sample of size n_1 is drawn and items are inspected until 1) r_1 (r_1 ≤ n_1) defectives are found, or 2) n_1 − a + 1 (a ≥ 0) nondefectives are found, or 3) the sample is exhausted with neither of these events occurring. If case 3) arises, a second sample of size n_2 is drawn and inspection proceeds until a grand total of r_2 (r_1 ≤ r_2 ≤ n_1 + n_2) defectives is found or n_1 + n_2 − r_2 + 1 nondefectives are found. In this scheme we call r_1 and r_2 rejection numbers and a an acceptance number. The unique unbiased estimate p̂ is as follows:

5 J. B. S. Haldane, Nature, Vol. 155 (1945), No. 3924.
6 For the uniqueness, see footnote *.


(a) p̂((j, r_1)) = (r_1 − 1)/(r_1 + j − 1), j = 0, 1, ..., n_1 − r_1;

(b) p̂((n_1 − a + 1, i)) = i/(n_1 − a + i), i = 0, 1, ...;


(c) p((x, r t )) = 


A 


x 0 + 2/0 — 
Xo 


, / To + J/o\/ 

\ *» A 


l\/x - .To + r 2 — 2/0 — l\ 

/V r 2 - 2 /o - 1 / 

py. 


a — To + r 2 — i/o — 
r 2 — po 


tti — Ti < x 2s «i + Ha ; 


To + 2/o — 1 \/ni + — r 2 4* V — 3/o Xo 


(d) p((n 1 + n 2 -r 2 + 


, „ X^r’X 

x > v» /„ , .. \/~ 


y - Vo 


) 


( a rX 


Hi + Hi -*■ r 5 + V ~ 2/o ~ Xfl'\ 

V - 2/o / 

a < y S Hi + Ho ; 


where the summations extend from y_0 = a + 1 to y_0 = r_1 − 1, and x_0 + y_0 = n_1.

In the above equations (a) and (b) are the estimates corresponding to rejection and acceptance on the basis of the first sample, while (c) and (d) correspond to rejection and acceptance when a second sample has been drawn. Rather than use the sums indicated in (c) and (d), some may find it preferable to make the estimation entirely on the basis of the first sample. If there is no curtailing, the procedure of estimation is equivalent to single sampling, and the estimate is again i/n_1 as mentioned in paragraph A above. If the first sample is curtailed and the estimate is made on the basis of the results of the first sample only, the unique unbiased estimate is given by formula (a) when rejecting, by formula (b) when accepting, and by i/n_1 when a second sample is to be drawn. It will be noted that (a) and (b) are identical with the expressions derived in paragraph B over the range of values for which they are valid.

D. The sequential probability ratio test. Using the nomenclature of sequential analysis,⁷ the criterion for a decision is given by two parallel straight lines in the dn-plane

$$d_1 = h_1 + sn \ \text{(lower line)}, \qquad d_2 = h_2 + sn \ \text{(upper line)}, \tag{7}$$

where d is the number of defectives and n is the number of observations. The acceptance and rejection numbers for any n are given by a_n and r_n, respectively,

7 See, for example, Sequential Analysis of Statistical Data: Applications, Section 2, Columbia University Press, 1945.
where a_n is the largest positive integer less than or equal to d_1, and r_n is the smallest integer greater than or equal to d_2. We let k_a(n) be the number of paths from the origin which end in a decision to accept on the nth observation; k_r(n) is similarly defined when rejection occurs on the nth observation. We also require an auxiliary sequential test with acceptance and rejection numbers a'_{n−1} = a_n − 1, r'_{n−1} = r_n − 1 (which is equivalent to replacing h_1 and h_2 by h_1 − 1 + s and h_2 − 1 + s in the equations (7)), and with k'_a(n) and k'_r(n) the number of paths from the origin which lead to acceptance or rejection on the nth observation for the new test. A graphical comparison of the two plans shows that: The unique unbiased estimate of p is

    p̂(n) = k'_a(n − 1)/k_a(n)

when the original test leads to a decision to accept, and

    p̂(n) = k'_r(n − 1)/k_r(n)

when the original test leads to a decision to reject on the nth observation.

E. Regions with narrow throats. Let us consider the case of a closed region which has only one accessible point of index n, n > 0 (n being the lowest index not zero at which this phenomenon occurs). The number of paths from the origin to this accessible point α' we will denote m, while the number of paths from α' to α, boundary points of index greater than n, will be denoted l(α). Then the total number of paths to α from the origin is ml(α). We use the construction preceding Theorem 3 to get p̂(α). The number of paths from (0, 1) to α is similarly m*l(α), so for such points p̂(α) = m*/m. In other words, if a closed region has a narrow throat such as that described, p̂(α) for α of index higher than that of the accessible point α' are independent of the shape of the region beyond the line x + y = n, and in fact they are all identical. The curtailed single sample with c = 1 is a particular case of a region with a narrow throat.


4. Estimation based on data from several experiments. In the previous discussion we have been concerned with estimation based on the result of a single experiment. Various kinds of acceptance sampling plans have been suggested as examples of the possible experiments. Acceptance sampling is one of many activities where data toward the estimation of p are often accumulated in a series of experiments. It has been pointed out by John Tukey that when information is available from several experiments the estimate p̂ will no longer be the unique unbiased estimate. Little has been done on the problem of combining several experiments, but to illustrate the point, we will discuss a very simple example in terms of acceptance sampling.

Suppose two lots of the same size are inspected according to the following sampling plan: if a defective occurs at the first or second observation, the lot is rejected; otherwise the lot is accepted.
The total number of defective and of nondefective items in the two samples form a sufficient statistic for p. In a single application of the sampling plan the boundary points with their probabilities are (0, 1), p; (1, 1), pq; (2, 0), q². From this information we can generate the possible totals of defectives and of nondefectives which may arise when samples are drawn from two lots, with their probabilities, by expanding

$$(p + pq + q^2)^2 = p^2 + p^2q^2 + 2p^2q + 2pq^3 + 2pq^2 + q^4, \tag{8}$$

where a term on the right of the form mp^y q^x is the probability that in two samples there will be x nondefectives and y defectives altogether. On the basis of the observed number pair (x, y), which may be regarded as a possible terminal point α for the two experiments performed successively, we wish to form an unbiased estimate e((x, y)) = e(α). For the estimate e to be unbiased the condition Σ e(α)P(α) = p must be satisfied, where in the present example the P(α) are the six terms on the right of equation (8), and the e(α) are the estimates with which the six probabilities are associated.

In the example under consideration the condition for unbiasedness will be satisfied if and only if e((0, 2)) = 1, e((4, 0)) = 0, e((1, 2)) = 1/2, e((2, 1)) = [1 − e((2, 2))]/2, e((3, 1)) = e((2, 2))/2. Consequently a one parameter family of unbiased estimates is available. Unfortunately the popular condition that the variance be a minimum depends on the true value of p; in fact the variance is minimized just when e((2, 2)) = 1/(2 + p). So an unbiased estimate of uniformly minimum variance does not exist. In practical applications to acceptance sampling one might meet this difficulty by choosing a value of p near zero for such a minimization scheme.

However it is clear that the last word has yet to be said about how best to estimate p when one is faced with the results of several experiments.
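The one parameter family can be examined with exact arithmetic. In the sketch below the parameter c stands for e((2, 2)); matching coefficients of the unbiasedness identity term by term gives e((1, 2)) = 1/2, e((2, 1)) = (1 − c)/2 and e((3, 1)) = c/2. The function names are assumptions made for this illustration.

```python
from fractions import Fraction


def family(c):
    """Unbiased estimates indexed by c = e((2, 2)); keys are (x, y) totals."""
    c = Fraction(c)
    return {(0, 2): Fraction(1), (1, 2): Fraction(1, 2), (2, 2): c,
            (2, 1): (1 - c) / 2, (3, 1): c / 2, (4, 0): Fraction(0)}


def probabilities(p):
    """The six terms of (p + pq + q^2)^2, keyed by (nondefectives, defectives)."""
    q = 1 - p
    return {(0, 2): p**2, (1, 2): 2 * p**2 * q, (2, 2): p**2 * q**2,
            (2, 1): 2 * p * q**2, (3, 1): 2 * p * q**3, (4, 0): q**4}


def mean_and_variance(c, p):
    e, prob = family(c), probabilities(p)
    mean = sum(e[a] * prob[a] for a in prob)
    variance = sum(e[a]**2 * prob[a] for a in prob) - mean**2
    return mean, variance


p = Fraction(1, 10)
best = 1 / (2 + p)                      # claimed minimizing value of e((2, 2))
for c in (Fraction(0), Fraction(1, 3), best, Fraction(1)):
    assert mean_and_variance(c, p)[0] == p          # every member is unbiased
step = Fraction(1, 50)
assert mean_and_variance(best, p)[1] < mean_and_variance(best + step, p)[1]
assert mean_and_variance(best, p)[1] < mean_and_variance(best - step, p)[1]
```

Because the minimizing value 1/(2 + p) moves with p, no single member of the family has the smallest variance for every p.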

5. Conclusion. We would like to call attention to a few problems raised by but not solved in this paper: 1) find a necessary and sufficient condition that p̂ be the unique unbiased estimate for p; 2) suggest criteria for selecting one unbiased estimate when more than one is possible; 3) evaluate the variance of p̂.

In this connection, in a forthcoming paper by M. A. Girshick, it will be shown for certain regions, for example for those of the sequential probability ratio test, that the variance of p̂(α),

$$\sigma^2 \ge pq / E(x + y),$$

where E(x + y) is the expected number of observations required to reach a boundary point.



DISTRIBUTION OF SAMPLE ARRANGEMENTS FOR RUNS 
UP AND DOWN 

By P. S. Olmstead
Bell Telephone Laboratories, Inc.

1. Summary. Using the notation of Levene and Wolfowitz [1], a new recursion formula is used to give the exact distribution of arrangements of n numbers, no two alike, with runs up or down of length p or more. These are tabled for n and p through n = 14. An exact solution is given for p > n/2. The average and variance determined by Levene and Wolfowitz are presented in a simplified form. The fraction of arrangements of n numbers with runs of length p or more are presented for the exact distributions, for the limiting Poisson Exponential, and for an extrapolation from the exact distributions. Agreement among the tables is discussed.

2. Introduction. Assume that 




represent a series of repetitive measurements. In engineering work, experience has shown that, when the values of these measurements exhibit changes in level, trends, cycles, etc., it is usually indicative of the presence of findable causes. In general, the engineer becomes more confident that a findable cause exists for a change in level, a trend, or a cycle, when the change is large, the trend is long, or the cycle is regular.

On the basis of this experience, the engineer selects particular measures of change in level, length of trend, etc., to guide him in deciding when it is profitable to look for a cause. Having selected the measure, he is interested in knowing how often he may have to look for a cause that does not exist. One such measure is the length of the longest run up or down in a sample of n values. The chart in Figure 1, based on the analysis given here, applies when no two values are alike and indicates the fraction of all nonidentical arrangements that have runs up or down of length p or more.


Attention is directed to the distribution of sample arrangements that have at least one run up or down of length p or more. The distribution and the variances and covariances for lengths of runs up and down are given by Levene and Wolfowitz [1]. In addition, Wolfowitz [2] has shown that the limiting distribution for a particular length of run up or down is a Poisson Exponential.

The notation of Levene and Wolfowitz [1] will be used. Thus, let a_1, a_2, ..., a_n be n numbers, no two alike, and let the sequence S = (h_1, h_2, ..., h_n) be any permutation of a_1, a_2, ..., a_n, where S is to be considered a chance variable, and each of the n! permutations of a_1, a_2, ..., a_n is assigned the same probability. Consider the derived sequence R whose ith element is the sign (+ or −) of h_{i+1} − h_i (i = 1, 2, ..., n − 1). A sequence of p consecutive + signs immediately preceded by a − sign is called a run up of length p or more; a sequence of p consecutive − signs immediately preceded by a + sign is called a run down of length p or more. When such a run is both immediately preceded and immediately followed by an unlike sign, it is a run of length exactly p. The distribution of arrangements with at least one run up or down of length p or more is considered under five specific headings:



Fig. 1 


1. An exact numerical solution for n small, i.e., computations have been 
completed up to and including n = 14. 


2. An exact solution for p > n/2.

3. A limiting solution for — *- ■ ' 


= constant. 


n 

4. An extrapolation from n small. 

5. Constant probability relationships. 


3. Solution for n small. Starting with a single number, a_1, a second number, a_2 > a_1, may be placed before or after it to obtain the two independent arrangements of one run of length exactly 1. A third number, a_3 > a_2 > a_1, may be placed before, between, or after the preceding pair to obtain two independent arrangements of one run of length exactly 2 and four of two runs of length exactly 1. Continuing this process it is seen that, on the assumption that the distribution of independent arrangements for (n − 1) numbers, a_1 < a_2 < a_3 < ... < a_{n−1}, is known, the distribution of independent arrangements for n numbers, a_1 < a_2 < a_3 < ... < a_n, can be found by using the following recursion formula:

Fn[r,,_i , r„_ 2 , • • • j i'h i ' ' ‘ i i " ' t r i > ' ' ‘ i ri l 

= Z Oi-i + l)F«-i[»V-s , r„_a , * ■ • , {r< - 1), (r.-i +1), ■ • ■ n] 

.-2 

-f* 2 F„_i[r„_ 2 , r„_a , ■ • ■ , (fi 1)] 

/I) n-3 »-l 

+ 2 E E (r» + 1) 

t»<2 7—1 

• Fn-^Xn-i j • • * i (r A _,+j -)- l)j ■ • ■ > (fi ~ ' 1)> ■ ' ‘ i (?j l)i j (^"i 1)1 

+ Z (n + l)Fn-itr„-3 , • • • , (r*_» + 1), • • • , (r, - 2), • • • (n - 1)] 

i-i 


where r_i, etc., represents the number of runs either up or down of exactly length i in each arrangement of the n numbers designated F_n,

(2) $\sum_{i=1}^{n-1} r_i = r$, the total number of runs having lengths exactly i (from 1 to n − 1) for each arrangement included in F_n,

(3) $\sum_{i=1}^{n-1} i\, r_i = n - 1$, that is, the sum of the lengths of all such runs in any arrangement is one less than the total number of numbers,

F_n[r_{n−1}, r_{n−2}, ..., r_h, ..., r_i, ..., r_j, ..., r_1], the total number of nonidentical sequences of the n numbers with exactly r_{n−1} runs of length exactly (n − 1), ..., r_h runs of length exactly h, ..., r_i runs of length exactly i, ..., r_j runs of length exactly j, ..., r_1 runs of length exactly 1. Some of these r's are of course zero and their sum is that given in (2) above. Similar statements apply to the four F_{n−1}'s.

In the last two summations in (1), when r , = n , (r, - 1) combines with 
(ti - 1) to give (n - 2), and when r, = n , (r< — 2) combines with (n — 1) to 
give (n - 3). 

By using the above recursion formula, the exact number of arrangements with at least one run up or down of length p or more has been computed for n = 2 to n = 14, inclusive. This information is given in Table 1. In addition, it has been used to determine the probabilities of arrangements with runs up or down of length p or more as shown in Table 2. These tables provide a useful background for the limiting expressions considered in the next three sections.
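For very small n the counts can be cross-checked by brute force. The sketch below does not implement the recursion formula of this section; it simply enumerates all n! arrangements, which is practical only for n up to about 9. The function names are assumptions made for the illustration.

```python
from itertools import permutations


def longest_run(seq):
    """Length of the longest run up or down in a sequence of at least two
    distinct numbers, measured in consecutive like signs of the differences."""
    best = current = 1
    for i in range(2, len(seq)):
        same_sign = (seq[i] > seq[i - 1]) == (seq[i - 1] > seq[i - 2])
        current = current + 1 if same_sign else 1
        best = max(best, current)
    return best


def count_with_run_at_least(n, p):
    """Number of arrangements of n distinct numbers whose longest run up or
    down has length p or more."""
    return sum(1 for s in permutations(range(n)) if longest_run(s) >= p)


# n = 3: two arrangements have one run of length 2; the other four have
# two runs of length 1, as described in the text.
assert count_with_run_at_least(3, 2) == 2
assert count_with_run_at_least(3, 1) == 6
```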



TABLE 1

Exact Numbers of Arrangements of n Numbers with Runs of Length p or More

4. Solution for p > n/2. When p > n/2, it is clear that no sequence can contain more than one run of length p. Thus, the expected number of runs of length p or more in an arrangement is also the probability that an arrangement contains runs of length p or more. Writing Levene and Wolfowitz's [1] expression (4.2) in the simplified form previously published [3], we have


$$P(r'_p) = E(r'_p) = \frac{2[(n-p)(p+1)+1]}{(p+2)!} \quad \text{for } \frac{n}{2} < p < n, \tag{4}$$

where r'_p represents the number of runs of length p or more. This expression checks exactly with Table 2 over the range to which it applies.
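Expression (4), taken here as E(r'_p) = 2[(n − p)(p + 1) + 1]/(p + 2)!, can be compared with an exhaustive count for small n. This is a sketch under that assumption; the function names are illustrative, and the enumeration is feasible only for small n.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def expected_runs(n, p):
    """E(r'_p): expected number of runs of length p or more."""
    return Fraction(2 * ((n - p) * (p + 1) + 1), factorial(p + 2))


def exact_fraction(n, p):
    """Fraction of the n! arrangements with a run of length p or more."""
    hits = 0
    for s in permutations(range(n)):
        signs = [s[i + 1] > s[i] for i in range(n - 1)]
        run = best = 1
        for a, b in zip(signs, signs[1:]):
            run = run + 1 if a == b else 1
            best = max(best, run)
        hits += best >= p
    return Fraction(hits, factorial(n))


# For p > n/2 at most one such run can occur, so probability = expectation.
for n in range(3, 8):
    for p in range(n // 2 + 1, n):
        assert exact_fraction(n, p) == expected_runs(n, p)
```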


6. Solution for — —- 1 — 
n 


= constant. 


As mentioned above, Wolfowitz [2] has shown that the limiting distribution for runs up and down is a Poisson Exponential. His proof applies specifically to the distribution of runs of length exactly p. However, the assumptions made in his derivation could have been applied to the distribution of runs of length p or more and would have led to identical conclusions for such runs. To see how closely this is approximated, it is possible to throw expression (4.17) for the variance of (r'_p) derived by Levene and Wolfowitz [1] into the following simplified form:

Y ? ) = M n ~ P)(P + !) + !) [ 1 _ 2 (p + l) 5 [0p* + 7 (p - 1)J 

l (p + 2)1 L (p + 2)!(2p + 3)(2p + 1) 


(5) 


, 4(p + 2) 

(2p + 3)1 


n + r (p + i)[(ap + 3 mp ~ i) 
u L pKp 4- 2)l(2p + 3)(2p + 1) 


B) 


+ 


(2 p + 3)! J/ L 1 ~ 


-1 


+ 


1 


L 4- 1 

L(p!) a (2p)i. 


J. _ p 

s , . / p> C2pru _ 

Thus, σ²(r'_p) is equal to E(r'_p) within one part in one thousand for p > 7 and it is apparent that the first two moments approximate those of a Poisson Exponential. Making use of this information, it is possible to prepare Table 3, which shows values of the probabilities obtained from

$$P(r'_p) = 1 - e^{-E(r'_p)} = 1 - e^{-2[(n-p)(p+1)+1]/(p+2)!}. \tag{6}$$

Comparison of Tables 2 and 3 shows agreement to closer than .0001 for p > 6, indicating that closer agreement may be expected as p is increased.
6. Extrapolation from the exact solution for n small. Since the exponential in equation (6) may be written in the form:

$$e^{-E(r'_p)} = e^{2[p(p+1)-1]/(p+2)!}\; e^{-2n(p+1)/(p+2)!}, \tag{7}$$

it follows that:

$$\frac{1 - P_n(r'_p)}{1 - P_{n-1}(r'_p)} = e^{-2(p+1)/(p+2)!}, \tag{8}$$


TABLE 3

Fraction of Arrangements of n Numbers with Runs of Length p or More Based on Poisson Exponential


\ 
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1 

r 

s 

■1 

» 

6 

7 

8 

9 

10 

>10 


7321 

lixJ*l 

1 
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2K15 

oirc» 
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<«CKt 

1220 

.(Wl 
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5031 

1393 

.0105 

.0004 
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071 1 
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. 1019 

0301 
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*» 

' 

•JMW 

7301 

.2107 

0435 

.0052 

.0004 

.0000 





h 


7017 

mi 

.0507 

.0076 

.0007 

.0001 

.(XKK) 






nun 

34'N 

.0097 

,0090 

.0011 

.0001 

.0000 

.0000 



HI 

ansa 

‘»712 

3X33 

,0X25 

.0122 

,0014 

.0001 

.0000 

.0000 

.0000 


11 

Wll 

•XMlt 

•1230 

. 0952 

.0146 

.0018 

.0002 

.0000 

.(XXX) 

0000 

.0000 

12 

0995 

ten 

•HWt 

.1076 

.0169 

.0021 

.0002 

.(XXX) 

.0000 

.0000 

.(XXX) 

13 

m.f! 

0112 

4051 

.1200 

.0193 

.0025 

.0003 

.0000 

.0000 

.0000 

.0000 

li 


0542 

527(1 

.1321 

.0210 

.0028 

.0003 

.0000 

.0000 

.0000 

0000 

13 


‘M13 

.6381 

.1411 

.02110 

.0032 

.0004 

.(XXX) 

.0000 

,0000 

.0000 

20 

1 .fXJUI) 


.axil 

.2015 

.0355 

.0040 

.0000 

.0001 

.0000 

.(XXX) 

0000 

■10 

1 

smt 

,0105 

.3052 

.0803 

.0118 

.0015 

.0002 

.(XXX) 

0000 

.0000 

GO 

M 

1 .0000 

. 07M) 

.5119 

.1231 

.0186 

.0023 

.0003 

.0000 

.0000 

oooo 

so 


H 

.0912 

.6530 

.1639 

.0254 

.0032 

.0004 

0000 

.0000 

.0000 

100 


*i 

.9985 

.7371 

.2030 

.0322 

.0041 

0005 

.0000 

.0000 

0000 

200 

li 

ir 

I .DOCK) 

,0315 

.3717 

.0652 

.0085 

.0010 

.0001 

.0000 

.0000 

mi 

“ 

! 

it 

.0900 

6024 

.1677 

.0215 

.0024 

.0002 

.0000 

.0000 

um 

li 

ii 

tl 

1 (XKK) 

.0005 

.2010 

.0428 

.0010 

.0005 

.0000 

.0000 

ftooo 

*• 



fi 

1.0000 

.8234 

1070 

.0246 

. 4)025 

.0002 

.0000 


showing that consecutive values of 1 − P(r'_p) are related by a constant of proportionality dependent only on p. Since this is true in the limit, Table 2 was examined to determine similar multipliers for extrapolation. The results of this examination are shown in Table 4 together with the values of (8). This table shows that the agreement between the value of [1 − P_n(r'_p)]/[1 − P_{n−1}(r'_p)] for n = 12, e.g., and e^{−2(p+1)/(p+2)!} becomes closer the larger the value of p. The constancy of the ratio for a given value of p is such as to permit calculation of probabilities for any value of n to a minimum of three or possibly four decimal places. Such calculations have been made and recorded in Table 5. The following formulae¹ were used for these calculations:

Pn(r't) ® 1 

l\(n) « 1 - (.00437304)^^"-“ 

(9) P M ( r',) ** 1 - (.45093729) (.92404) n - 1! 

P«(n) » 1 - (.87587019) (.98681)""" 

P»(K) - 1 - (.98060695) (.99760) n_H 
PM) » 1 “ (.99752014) (.999652) n ~ 13 

or in general 

$$P_n(r'_p) = 1 - [1 - P_{n_0}(r'_p)]\,[\text{Constant}_p]^{\,n - n_0}. \tag{10}$$

Comparison of Table 3 with Tables 2 and 5 shows that the difference for given p and n has a maximum for each value of p and that this maximum decreases with increase in p. The maximum values of the difference shown in the tables are: p = 1, n = 2, .2679; p = 2, n = 6, .1691; p = 3, n = 20, .0572; p = 4, n = 80, .0154; p = 5, n = 500, .0033; and p = 6, n = 5000, .0007. Thus, it is apparent that the agreement beyond p = 6 should be within .0001 and the method of Section 5 used for Table 3 is satisfactory for these probabilities.

7. Constant probability relationships. From Tables 2, 3 and 5, it is possible to make interpolations for the values of n required to have a probability of at least P(r'_p) that an arrangement will have a run of length p or more. When the conditions of Section 5 apply, the value of n is, of course:

$$n = p - \frac{1}{p+1} - \frac{(p+2)!}{2(p+1)} \log_e [1 - P(r'_p)]. \tag{11}$$
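Equation (11) follows from solving 1 − P = exp(−2[(n − p)(p + 1) + 1]/(p + 2)!) for n. A minimal sketch, with the function name chosen only for this illustration:

```python
from math import exp, factorial, log


def sample_size(p, prob):
    """Value of n at which the Poisson Exponential probability of a run of
    length p or more equals `prob`, from equation (11)."""
    return p - 1 / (p + 1) - factorial(p + 2) / (2 * (p + 1)) * log(1 - prob)


# Round trip: substituting n back into the Poisson Exponential recovers prob.
n = sample_size(4, 0.99)
mean = 2 * ((n - 4) * 5 + 1) / factorial(6)
assert abs((1 - exp(-mean)) - 0.99) < 1e-12

# Integer parts agree with legible entries of Table 6.
assert int(sample_size(2, 0.99)) == 20
assert int(sample_size(4, 0.99)) == 335
assert int(sample_size(3, 0.90)) == 37
```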


¹ It will be noted that the constant for $p = 2$ has been taken to be $2/\pi$,
whereas the last value shown in Table 4 is .63661959. However, alternate values
in this series are converging. Comparing these subseries shows that by $n = 16$,
the values would agree with $2/\pi$ to eight decimal places. An analytic proof
that $2/\pi$ is the limiting value of the constant has recently been found by
J. W. Tukey.

While reading the manuscript J. Riordan observed that the number of arrangements
with longest length 1, say $f(n, 1)$, has the generating function

$$\sum_n f(n, 1)\, \frac{t^n}{n!} = 2(\sec t + \tan t),$$

hence is twice the Euler number for $n$ even and twice the tangent number for $n$
odd, a result given essentially by Netto [4]. These observations lead directly
to the limiting value, $2/\pi$, noted above.
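The limiting value $2/\pi$ can be illustrated numerically. The sketch below is not part of the paper: it assumes that the coefficients of $\sec t + \tan t$ are the Euler zigzag numbers $E_n$ (secant numbers for even $n$, tangent numbers for odd $n$), generates them with the Seidel triangle, and forms the ratio of successive fractions $2E_n/n!$. The function names are hypothetical.

```python
import math


def zigzag_numbers(n_max):
    """Euler zigzag numbers E_0, ..., E_n_max via the Seidel (boustrophedon) triangle."""
    numbers = [1]
    row = [1]
    for n in range(1, n_max + 1):
        new_row = [0]
        # Each new entry adds the previous row read in the opposite direction.
        for k in range(n):
            new_row.append(new_row[-1] + row[n - 1 - k])
        row = new_row
        numbers.append(row[-1])
    return numbers


def successive_ratio(n, numbers):
    """Ratio of the fraction 2*E_n/n! to the fraction 2*E_(n-1)/(n-1)!."""
    return numbers[n] / (n * numbers[n - 1])


E = zigzag_numbers(16)
for n in (13, 14, 15, 16):
    ratio = successive_ratio(n, E)
    print(n, ratio, ratio - 2 / math.pi)
```

The printed differences alternate in sign and shrink with $n$, which matches the remark that alternate values in the series are converging.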
TABLE 5

Fraction of Arrangements of n Numbers with Runs of Length p or More Based on
Extrapolation with Extrapolation Constant



.0217 
. 021 J 
.0358 
.OHIO 
, 1211 
. I<i52 
.2044 
. 3743 
. ii!if>7 

. ma 

1.0000 


JHVJH 

jxm 
. 00(0 
.0I1K 
.01K7 
.0255 
.0322 
. 0053 
. 1580 
. 2025 
.8241 


TABLE 6

Sample Size for Constant Probability Based on Poisson Exponential

                                     p
P(r'_p)     1     2     3     4      5       6       7       8
< .99       7    20    71   335   1939   13268
< .95       5    13    47   219   1263    8633
< .90       3    10    37   169    971    6637
< .10       0     2     4    11     49     309    2200
< .05       0     1     3     7     26     153    1170   10350
< .01       0     1     2     4      9      34     235    2030


TABLE 7 



(i 


13230 
HO 14 
0022 
308 
153 
34 
Similarly, it may be obtained from the extrapolation formulae of Section 6
in the form:

(12) $$n = n_0 + \frac{\log [1 - P(r'_p)] - \log [1 - P_{n_0}(r'_p)]}{\log [\mathrm{Constant}_p]}.$$

Results of computations based on (11) and (12) are given in Tables 6 and 7,
respectively, for particular values of $P(r'_p)$. It will be noted that Table 7 is in
exact agreement with Table 2 and that it differs but little in a practical sense
from Table 6.
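A short sketch of how (12) inverts the extrapolation formula (10) may help. It is an illustration only: the function names are hypothetical, and the numerical inputs for $p = 6$ (the coefficient .99752014, the constant .999652 and the offset $n_0 = 13$) are taken as read from the formulae of Section 6, whose digits are hard to make out in places.

```python
import math


def prob_extrapolated(n, n0, prob_n0, constant):
    """Evaluate (10): P_n = 1 - (1 - P_n0) * constant ** (n - n0)."""
    return 1 - (1 - prob_n0) * constant ** (n - n0)


def sample_size_extrapolated(prob, n0, prob_n0, constant):
    """Evaluate (12): n = n0 + (log(1 - prob) - log(1 - P_n0)) / log(constant)."""
    return n0 + (math.log(1 - prob) - math.log(1 - prob_n0)) / math.log(constant)


# Assumed inputs for p = 6: 1 - P_13 = .99752014 and Constant_6 = .999652.
N0, PROB_N0, CONSTANT = 13, 1 - 0.99752014, 0.999652

n_99 = sample_size_extrapolated(0.99, N0, PROB_N0, CONSTANT)
print(round(n_99, 1))
# Round trip: the probability at that n should again be .99.
print(prob_extrapolated(n_99, N0, PROB_N0, CONSTANT))
```

Because the extrapolation constant is so close to 1 for large $p$, the result is sensitive to its last digits, so the output should be read as approximate.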
THE THEORY OF UNBIASED ESTIMATION

1. Summary. Let $F(P)$ be a real valued function defined on a subset
$\mathfrak{D}$ of the set $\mathfrak{D}^*$ of all probability distributions on the
real line. A function $f$ of $n$ real variables is an unbiased estimate of $F$ if
for every system, $X_1, \ldots, X_n$, of independent random variables with the
common distribution $P$, the expectation of $f(X_1, \ldots, X_n)$ exists and equals
$F(P)$, for all $P$ in $\mathfrak{D}$. A necessary and sufficient condition for the
existence of an unbiased estimate is given (Theorem 1), and the way in which this
condition applies to the moments of a distribution is described (Theorem 2).
Under the assumptions that this condition is satisfied and that $\mathfrak{D}$
contains all purely discontinuous distributions it is shown that there is a
unique symmetric unbiased estimate (Theorem 3); the most general (non symmetric)
unbiased estimates are described (Theorem 4); and it is proved that among them
the symmetric one is best in the sense of having the least variance (Theorem 5).
Thus the classical estimates of the mean and the variance are justified from a
new point of view, and also, from the theory, computable estimates of all higher
moments are easily derived. It is interesting to note that for $n$ greater than 3
neither the sample $n$th moment about the sample mean nor any constant multiple
thereof is an unbiased estimate of the $n$th moment about the mean. Attention is
called to a paradoxical situation arising in estimating such non linear
functions as the square of the first moment.


2. Introduction. Consider the set $\mathfrak{D}^*$ of all probability
distributions on the real line. The elements $P$ of $\mathfrak{D}^*$ may be
regarded as either set functions $P(E)$ defined for all Borel subsets $E$ of the
real line (probability measures) or monotone non decreasing functions $P(x)$ of a
real variable $x$ (cumulative distribution functions). Suppose that $F = F(P)$ is
a real numerically valued function of distributions. For example $F(P)$ may be
the expectation or the standard deviation of the distribution $P$, or it may be
the amount of probability $P$ assigns to [illegible]. The problem of unbiased
estimation is to find a function (statistic) of a sample of $n$ from a population
with distribution $P$, in such a way that the expected value of this function is
equal to the value of $F(P)$ identically in $P$. [illegible] an unbiased estimate of
order $n$ over $\mathfrak{D}$ is a real valued function $f = f(x_1, \ldots, x_n)$ of
$n$ real variables, which is such that for every system $X_1, \ldots, X_n$ of
independent random variables with the same distribution $P$ in $\mathfrak{D}$ the
expectation of $f(X_1, \ldots, X_n)$ exists and equals $F(P)$. [illegible] (I)
[illegible] (II) [illegible] all possible unbiased estimates of a given function
$F(P)$? (III) Is there a reasonable definition of "best unbiased estimate" which
enables one to select from all unbiased estimates of a fixed function $F(P)$ a
unique best one?¹

I shall present below a complete solution of these problems, under the
assumption that the domain of estimation, $\mathfrak{D}$, is sufficiently large. The results also
shed light on some classical concepts. It is possible, for instance, to exhibit 
computable unbiased estimates for all moments of a distribution about its ex- 
pected value, and to prove that the known estimates of the expectation and the 
variance are essentially unique. 

The vague concept of sufficiently large estimation domain $\mathfrak{D}$ is easily
made precise. For any Borel set $E$ on the real line let $\mathfrak{D}_*(E)$ be the
set of all those distributions which assign the probability 1 to some finite
subset of $E$. Thus, for example, if $E$ consists of exactly two points then
$\mathfrak{D}_*(E)$ is the set of all possible probability distributions in a
dichotomy. A subset $\mathfrak{D}$ of $\mathfrak{D}^*$ will be said to be finitely
closed over $E$ if $\mathfrak{D}_*(E) \subseteq \mathfrak{D}$. Finitely closed
domains are "sufficiently large."

It is clear that some restriction (from below) on the size of $\mathfrak{D}$ is
essential for a discussion of the characterization problem (II) and the
uniqueness problem (III). For if, for example, the domain $\mathfrak{D}$ is
artificially restricted to contain only one distribution, then there will always
be a plethora of completely unrelated and uninteresting solutions of the problem
of unbiased estimation, none of which can be said to be preferable to any other
one. It is true, however, that
the assumption of finite closure is too restrictive. The general problems of 
unbiased estimation are still unsolved over such interesting and useful domains 
as the set of all continuous distributions, and the set of all absolutely continuous 
distributions. There are also more special problems connected with special 
classes of distributions (e.g. the normal and the rectangular distributions), as 
well as the general problem of characterizing the domains which are sufficiently 
large to make a uniqueness theorem possible. I hope to return to these problems 
in the near future.

3. Existence. A function $F(P)$, defined on a domain
$\mathfrak{D} \subseteq \mathfrak{D}^*$, will be called homogeneous over
$\mathfrak{D}$, of degree $k = 1, 2, \ldots$, if there exists a real valued function
$\varphi = \varphi(x_1, \ldots, x_k)$ of $k$ real variables which is such that for
every $P$ in $\mathfrak{D}$ the Lebesgue-Stieltjes integral²

$$\int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

exists and is equal to $F(P)$, and if the integer $k$ is minimal with respect to the
property of the existence of such a representation.

¹ My interest in these problems stems from conversations and correspondence with
Reinhold Baer, who first called my attention to the problem of finding unbiased
estimates for the moments about the expected value. The general questions of
existence and uniqueness of unbiased estimates were raised explicitly by
J. F. Steffensen in a footnote on p. 18 of his book, Some Recent Researches in the
Theory of Statistics and Actuarial Science, Cambridge Univ. Press, 1930.

² All integrals in this paper are to be extended over the entire Euclidean space
of indicated dimension.

Theorem 1. A necessary and sufficient condition that $F$ have an unbiased
estimate of order $n$ over $\mathfrak{D}$ is that it be homogeneous over
$\mathfrak{D}$ of degree $k \leq n$.

Proof. To prove sufficiency, suppose that

$$F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

for all $P$ in $\mathfrak{D}$, with $k \leq n$. Define $f$ by

$$f(x_1, \ldots, x_k, x_{k+1}, \ldots, x_n) = \varphi(x_1, \ldots, x_k).$$

Then if $X_1, \ldots, X_n$ are independent random variables with the same
distribution $P$ (belonging to $\mathfrak{D}$)

$$E\{f(X_1, \ldots, X_n)\} = \int \cdots \int f(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n)$$

$$= \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_n)$$

$$= \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k) = F(P).$$

The necessity of the condition is even more trivial: the definition of an
unbiased estimate of order $n$ is such that the existence of one is equivalent to
homogeneity of degree $\leq n$.

As a special case, and an important illustration of how the degree is evaluated,
consider the moments $F_m = F_m(P)$ of a distribution $P$ about the origin,

$$F_m(P) = \int x^m \, dP(x),$$

and the moments $\bar{F}_m(P)$ about the expected value $F_1(P)$,

$$\bar{F}_m(P) = \int (x - F_1(P))^m \, dP(x).$$


Theorem 2. If $\mathfrak{D}$ is any subset of $\mathfrak{D}^*$ contained in the
domain of definition of each of the functions $F_1, \ldots, F_r$, and finitely
closed over $\{0, 1\}$ (where $\{0, 1\}$ denotes the set containing the two numbers 0
and 1 only), and if $k_1, \ldots, k_r$ are arbitrary non negative integers, then the
function

$$F(P) = [F_1(P)]^{k_1} \cdots [F_r(P)]^{k_r}$$

is homogeneous over $\mathfrak{D}$ of degree exactly $k = k_1 + \cdots + k_r$.

Proof. The representation of $F$ by a $k$-fold integral,

$$F(P) = \int \cdots \int x_1 \cdots x_{k_1} \, x_{k_1+1}^2 \cdots x_{k_1+k_2}^2 \cdots x_{k_1+\cdots+k_r}^r \, dP(x_1) \cdots dP(x_k),$$

shows that $F$ is homogeneous of degree $\leq k$. That the degree of $F$ is indeed
equal to $k$ is proved as follows. Suppose that

$$F(P) = \int \cdots \int \varphi(x_1, \ldots, x_h) \, dP(x_1) \cdots dP(x_h)$$

for all $P$ in $\mathfrak{D}$. Observe that if $P$ is the singular distribution which
assigns probability 1 to the point 1 on the real line then the identity of the
two representations of $F$ reduces to $\varphi(1, \ldots, 1) = 1$; similarly
assigning the total probability to 0 implies that $\varphi(0, \ldots, 0) = 0$. More
generally, choose $P$ so that it assigns the probability $p$ ($0 \leq p \leq 1$) to
the point 1, and the probability $q = 1 - p$ to 0. It follows that

$$p^k = p^h + p^{h-1} q \, \varphi_1 + \cdots + p \, q^{h-1} \varphi_{h-1},$$

where $\varphi_i$ is the sum of all $\varphi(x_1, \ldots, x_h)$ over those $h$-tuples
$(x_1, \ldots, x_h)$ which contain exactly $i$ 0's and $(h - i)$ 1's. If $q$ is replaced
by $1 - p$ in the right side of the last equation, the resulting equation is
supposed to be satisfied by all $p$, $0 \leq p \leq 1$. If, however, $h < k$, then the
two sides of the equation are polynomials of different degrees; hence $h \geq k$.

Corollary. If $\mathfrak{D}$ is any subset of $\mathfrak{D}^*$ contained in the
domain of definition of the function $\bar{F}_m$ and finitely closed over $\{0, 1\}$
then $\bar{F}_m$ is homogeneous over $\mathfrak{D}$ of degree exactly $m$ and,
consequently, it has unbiased estimates over $\mathfrak{D}$ of order $n$ if and only
if $m \leq n$.

Proof. Since

$$\bar{F}_m(P) = \int (x - F_1(P))^m \, dP(x)$$

$$= \sum_{j=0}^{m} (-1)^j \binom{m}{j} [F_1(P)]^j \int x^{m-j} \, dP(x)$$

$$= \sum_{j=0}^{m} (-1)^j \binom{m}{j} [F_1(P)]^j F_{m-j}(P),$$

the conclusions of the corollary are implied by Theorems 1 and 2.

4. Symmetry. Theorem 1 may be regarded as a solution of the existence 
problem (I). An examination of its proof shows, however, that the estimates 
there constructed are very unsatisfactory indeed. In the special case $F = F_1$,
for instance, the estimate becomes $f(x_1, \ldots, x_n) = x_1$. The first element of a
sample of n is, to be sure, an unbiased estimate of the expectation of the dis- 
tribution, but it is intuitively clear that, since it ignores most of the information 
at hand, it is not a good one. In order to exhibit the best estimates it becomes 
necessary to study the symmetric ones. Recall that a function
$f = f(x_1, \ldots, x_n)$ is symmetric if it is invariant under all permutations of its arguments. The
proof of the main theorem of this section, the theorem of uniqueness for sym- 
metric unbiased estimates, is based on two lemmas. 
Lemma 1. If $Q = Q(p_1, \ldots, p_n)$ is a homogeneous polynomial of degree
$\geq 0$ in $n$ real variables, such that whenever $0 \leq p_i \leq 1$,
$i = 1, \ldots, n$, and $p_1 + \cdots + p_n = 1$ then $Q(p_1, \ldots, p_n) = 0$, then
$Q$ must be identically zero.

Proof. (Induction on $n$.) For $n = 1$ the lemma is trivial. Assume therefore
that $n > 1$ and that the lemma is true for $n - 1$. Observe that the hypothesis is
equivalent to the vanishing of $Q$ for all systems of non negative arguments
(without the restriction $p_1 + \cdots + p_n = 1$), since any such system $\{p_i\}$
can be replaced by $\{p_i/(p_1 + \cdots + p_n)\}$. If in $Q$ the variables
$p_1, \ldots, p_{n-1}$ are given any non negative values, then the hypothesis
implies that the resulting polynomial in $p_n$ vanishes for all non negative
values of $p_n$, and therefore identically. Consequently the coefficients of the
powers of $p_n$ in $Q$, which are themselves homogeneous polynomials in
$p_1, \ldots, p_{n-1}$, vanish for non negative arguments and therefore (by the
induction hypothesis) identically.³

Lemma 2. If $\mathfrak{D}$ is a set of distributions finitely closed over a Borel
set $E$ of the real line and if the symmetric function $f(x_1, \ldots, x_n)$ is such
that for every distribution $P$ in $\mathfrak{D}$ the Lebesgue-Stieltjes integral

$$\int \cdots \int f(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n)$$

exists and has the value zero, then $f(x_1, \ldots, x_n) = 0$ whenever $x_i \in E$,
$i = 1, \ldots, n$.

Proof. Consider any point $(x_1^0, \ldots, x_n^0)$ with $x_i^0 \in E$,
$i = 1, \ldots, n$, and any distribution $P$ (in $\mathfrak{D}_*(E)$) which assigns
the probability 1 to the subset $\{x_1^0, \ldots, x_n^0\}$ of $E$. If the probability
of $x_i^0$ is $p_i$, $i = 1, \ldots, n$, then the integral

$$\int \cdots \int f(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n)$$

is a homogeneous polynomial (of degree $n$) in the $n$ variables $p_1, \ldots, p_n$.
The hypotheses of Lemma 1 are satisfied; it follows that this polynomial vanishes
identically. The symmetry of $f$ implies that the coefficient of the term
$p_1 p_2 \cdots p_n$ is exactly $n! \, f(x_1^0, \ldots, x_n^0)$, thereby establishing
the conclusion of the lemma.

If $\varphi = \varphi(x_1, \ldots, x_k)$ is any function of $k$ real variables and if
$n$ is a positive integer, $n \geq k$, it is convenient to write

$$\varphi^{[n]} = \varphi^{[n]}(x_1, \ldots, x_n)$$

for the average of the values of $\varphi$ over all points obtained from
$(x_1, \ldots, x_n)$ by extracting ordered subsets of $k$ $x$'s. Thus, for instance,

$$(x_1 x_2)^{[3]} = \tfrac{1}{3}(x_1 x_2 + x_1 x_3 + x_2 x_3)$$

and

$$(x_1)^{[n]} = \tfrac{1}{n}(x_1 + \cdots + x_n).$$
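The symmetrization just defined can be sketched in a few lines of Python. This is an illustration rather than anything from the paper: the helper name `symmetrize` is hypothetical, and the average is taken over all ordered selections of $k$ of the $n$ sample values, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations


def symmetrize(phi, k):
    """Return phi^[n]: the average of phi over all ordered k-subsets of a sample."""
    def phi_n(*xs):
        if len(xs) < k:
            raise ValueError("need a sample of at least k values")
        # permutations(xs, k) enumerates every ordered selection of k positions.
        values = [phi(*subset) for subset in permutations(xs, k)]
        return sum(values, Fraction(0)) / len(values)
    return phi_n


product_sym = symmetrize(lambda a, b: a * b, 2)
mean_sym = symmetrize(lambda a: a, 1)

print(product_sym(1, 2, 3))   # (1*2 + 1*3 + 2*3) / 3
print(mean_sym(1, 2, 3, 6))   # (1 + 2 + 3 + 6) / 4
```

The two calls reproduce the two instances given in the text: the averaged product for a sample of three and the ordinary sample mean.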


³ I am indebted to J. B. Rosser and [initials illegible] Walker for this proof;
my original proof of Lemma 1 was more complicated.
Theorem 3. Let $\mathfrak{D}$ be a set of distributions finitely closed over a
Borel set $E$ of the real line and let $F$ be a homogeneous function of degree $k$,

$$F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

over $\mathfrak{D}$. If $f(x_1, \ldots, x_n)$ is a symmetric unbiased estimate of $F$
over $\mathfrak{D}$, of order $n \geq k$, then for every point $(x_1, \ldots, x_n)$
with $x_i \in E$, $i = 1, \ldots, n$, $f(x_1, \ldots, x_n)$ is equal to the symmetrized
function $\varphi^{[n]}(x_1, \ldots, x_n)$.

Proof. Observe first that

$$\int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

remains invariant if $(x_1, \ldots, x_k)$ is replaced by
$(x_{i_1}, \ldots, x_{i_k})$, where $\{i_1, \ldots, i_k\}$ is any subset of
$\{1, \ldots, n\}$, since the change is merely a matter of notation. It follows that

$$F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

$$= \int \cdots \int \varphi^{[n]}(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n),$$

so that $\varphi^{[n]}$ is indeed an unbiased estimate of $F$. Since $\varphi^{[n]}$ is
also symmetric, $f - \varphi^{[n]}$ satisfies the hypotheses of Lemma 2, and the
desired conclusion follows from an application of that lemma.

5. Characterization. For any Borel set $E$ on the real line let
$\mathfrak{D}^*(E)$ be the set of all those distributions which assign the
probability 0 to the complement of $E$. Thus, clearly,
$\mathfrak{D}_*(E) \subseteq \mathfrak{D}^*(E)$; if $E$ is the entire real line then
$\mathfrak{D}^*(E) = \mathfrak{D}^*$; if $E$ consists of a finite number of points
then $\mathfrak{D}_*(E) = \mathfrak{D}^*(E)$.

Theorem 4. Let $\mathfrak{D}$ be a set of distributions finitely closed over a
Borel set $E$ of the real line and contained in $\mathfrak{D}^*(E)$, and let $F$ be a
homogeneous function of degree $k$,

$$F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

over $\mathfrak{D}$. A necessary and sufficient condition that the function
$f = f(x_1, \ldots, x_n)$ be an unbiased estimate of $F$ over $\mathfrak{D}$, of order
$n \geq k$, is that the Lebesgue-Stieltjes integral

$$\int \cdots \int f(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n)$$

exist for every $P$ in $\mathfrak{D}$ and that for every point $(x_1, \ldots, x_n)$
with $x_i \in E$, $i = 1, \ldots, n$, the symmetrized function
$f^{[n]}(x_1, \ldots, x_n)$ be equal to $\varphi^{[n]}(x_1, \ldots, x_n)$.



PAUL R. HALMOS


Proof. If $f$ is an unbiased estimate then $f^{[n]}$ is a symmetric unbiased
estimate and therefore, by Theorem 3, equal to $\varphi^{[n]}$; the converse
follows from the facts that

$$\int \cdots \int f(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n) = \int \cdots \int f^{[n]}(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n)$$

and that (as a consequence of the hypothesis
$\mathfrak{D} \subseteq \mathfrak{D}^*(E)$) the equality of $f^{[n]}$ and
$\varphi^{[n]}$ for points whose coordinates are in $E$ implies the equality of
their integrals.

Theorem 4 exhibits all possibilities for unbiased estimates (over domains
satisfying the hypotheses). Given a point $(x_1, \ldots, x_n)$, suppose that the
number of different points obtained from it by permutations of the coordinates
is $N$. (If the $x_i$ are all different then $N = n!$.) An unbiased estimate is
obtained if $f$ is defined arbitrarily over $N - 1$ of these points and if its
value on the $N$th point is chosen so that the identity
$f^{[n]} = \varphi^{[n]}$ is satisfied. As long as the arbitrary choices at the
(possibly) uncountably infinite point groups are not too wild and not too large
(i.e. are such that the resulting function $f$ is measurable and integrable), $f$
will indeed be an unbiased estimate. Typical nonpathological examples of
unsymmetric unbiased estimates are weighted averages of the permuted values of
$\varphi(x_1, \ldots, x_k)$, similar to the unweighted average
$\varphi^{[n]}(x_1, \ldots, x_n)$.

6. Uniqueness. The assumption of symmetry is a rather natural one to require 
of an estimate: it amounts to requiring that the estimated value should be 
independent of the order in which the observations are made. Theorems 3 
and 4 establish that the concept of symmetry is inherently associated with un- 
biased estimation and that, under this assumption, there is a unique unbiased 
estimate (whenever there is one at all). These theorems, therefore, constitute 
a partial answer to the uniqueness problem (III) : symmetry, after all, is a possi- 
ble interpretation of “good” estimate. From another point of view the answer 
to the problem of "best” estimate is contained in the following theorem. 

Theorem 5. Under the hypotheses of Theorem 4, among all unbiased estimates of

$$F(P) = \int \cdots \int \varphi(x_1, \ldots, x_k) \, dP(x_1) \cdots dP(x_k)$$

the symmetric one, $\varphi^{[n]}(x_1, \ldots, x_n)$, is the one with least variance
or, equivalently, the least second moment

$$\int \cdots \int \{f(x_1, \ldots, x_n)\}^2 \, dP(x_1) \cdots dP(x_n).$$

Proof. Observe first that if $X_1, \ldots, X_n$ are independent random variables
with the same distribution $P$ then, if $f$ is an unbiased estimate of $F(P)$, the
variance of $f(X_1, \ldots, X_n)$ is given by

$$E\{f(X_1, \ldots, X_n)\}^2 - \{F(P)\}^2.$$

Since the second term is the same for all $f$, namely $F^2(P)$, minimizing the
variance is indeed equivalent to minimizing

$$E\{f(X_1, \ldots, X_n)\}^2 = \int \cdots \int \{f(x_1, \ldots, x_n)\}^2 \, dP(x_1) \cdots dP(x_n).$$

This quantity need not be finite even for $f$'s and $P$'s for which
$E\{f(X_1, \ldots, X_n)\}$ exists. It will be shown, however, to be minimized by
$\varphi^{[n]}$ in the sense that

$$E\{\varphi^{[n]}(X_1, \ldots, X_n)\}^2 \leq E\{f(X_1, \ldots, X_n)\}^2$$

for all unbiased estimates $f$ and all $P$, and that the inequality actually holds
for some $P$.

For the proof consider any unbiased estimate $f$ of $F$. For any given point
$(x_1, \ldots, x_n)$ suppose that $N$ is the number of different points obtained from
it by permutations of the arguments, and denote by $f_i$, $i = 1, \ldots, N$, the
values of $f$ at these points. Since, according to Theorem 4,
$f^{[n]} = \varphi^{[n]}$, it follows that

$$(\varphi^{[n]})^2 = \Big(\frac{1}{N} \sum_{i=1}^{N} f_i\Big)^2 \leq \frac{1}{N} \sum_{i=1}^{N} f_i^2 = (f^2)^{[n]}.$$

Hence

$$\int \cdots \int \{\varphi^{[n]}(x_1, \ldots, x_n)\}^2 \, dP(x_1) \cdots dP(x_n)$$

$$\leq \int \cdots \int \{f^2(x_1, \ldots, x_n)\}^{[n]} \, dP(x_1) \cdots dP(x_n)$$

$$= \int \cdots \int f^2(x_1, \ldots, x_n) \, dP(x_1) \cdots dP(x_n).$$

This already establishes the minimal property of $\varphi^{[n]}$ in the weak sense.

If the inequality were an equality for all $P$ for which the terms are defined
then, by Lemma 2, it would follow that

$$\{\varphi^{[n]}(x_1, \ldots, x_n)\}^2 = \{f^2(x_1, \ldots, x_n)\}^{[n]}$$

for all $(x_1, \ldots, x_n)$. Hence the Schwarz inequality, as applied above to the
sum $\sum_{i=1}^{N} f_i$, reduces to an equality; this can happen if and only if
$(f_1, \ldots, f_N)$ is proportional to $(1, \ldots, 1)$, i.e. if and only if all
$f_i$ are equal to each other. The validity of this statement for every point is
equivalent to the symmetry of $f$ and hence, by Theorem 3, to the statement
$f = \varphi^{[n]}$. This concludes the proof of Theorem 5.
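The content of Theorem 5 can be checked on a small case by exact enumeration. The sketch below is only an illustration under stated assumptions: the three-point distribution, the helper `expectation` and the two estimates of $(F_1(P))^2$ of order 3 are chosen for the example and are not taken from the paper. One estimate is the unsymmetric $x_1 x_2$ of the kind built in the proof of Theorem 1, the other is its symmetrization.

```python
from fractions import Fraction
from itertools import permutations, product


def expectation(f, support, probs, n):
    """Exact E f(X_1, ..., X_n) for independent draws from a finite distribution."""
    total = Fraction(0)
    for idx in product(range(len(support)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        total += weight * f(*[support[i] for i in idx])
    return total


def first_pair(x1, x2, x3):
    """Unsymmetric estimate: uses only the first two observations."""
    return x1 * x2


def symmetric_pair(x1, x2, x3):
    """Symmetrized estimate: average of x_i * x_j over all ordered pairs i != j."""
    pairs = list(permutations((x1, x2, x3), 2))
    return Fraction(sum(a * b for a, b in pairs), len(pairs))


# Hypothetical three-point distribution used only for this check.
support = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
target = sum(p * x for p, x in zip(probs, support)) ** 2   # (F_1(P))^2

for estimate in (first_pair, symmetric_pair):
    mean = expectation(estimate, support, probs, 3)
    second = expectation(lambda *xs: estimate(*xs) ** 2, support, probs, 3)
    print(estimate.__name__, mean, second)
```

Both estimates have expectation equal to the target, while the symmetric one has the smaller second moment, as the theorem asserts.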
7. Concluding remarks. (1) The most obvious estimates of the moments,
$F_m(P)$, of a distribution about the origin are the sample moments

$$\frac{1}{n} \sum_{i=1}^{n} x_i^m.$$

Their use is justified by the uniqueness theorems (3, 4, and 5) of this paper.
Similarly one might think that the moments, $\bar{F}_m(P)$, about the expected
value $F_1(P)$, are best estimated by the sample moments

$$g_m(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^m$$

about the sample mean, $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. Denote by
$f_m(x_1, \ldots, x_n)$ the estimate of $\bar{F}_m(P)$ obtained by expanding
$\bar{F}_m(P)$ in terms of the $F_j(P)$, as in the proof of the corollary to
Theorem 2, and then estimating each term by the symmetric estimate considered in
Theorems 3 and 4. Then an easy calculation shows that

$$f_2(x_1, \ldots, x_n) = \frac{n}{n-1} \, g_2(x_1, \ldots, x_n)$$

and

$$f_3(x_1, \ldots, x_n) = \frac{n^2}{(n-1)(n-2)} \, g_3(x_1, \ldots, x_n).$$

(These functions are the classical estimates of $\bar{F}_2$ and $\bar{F}_3$.) For
$m > 3$, $f_m$ can still be expressed in terms of $g$'s, but no longer as a constant
multiple of $g_m$. It appears that in general $f_m$ is a linear combination of
$g_1, \ldots, g_m$ with coefficients which are rational numbers whose denominators
are $(n - 1)(n - 2) \cdots (n - m + 1)$. This fact is another aspect of the non
existence of unbiased estimates of order $n$ for $\bar{F}_m$ when $m > n$.
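The two classical estimates can be verified exactly on a small example. The following Python sketch is an illustration only: the three-point distribution and all function names are assumptions made for the check, and the estimates are taken in the form $f_2 = \frac{n}{n-1} g_2$ and $f_3 = \frac{n^2}{(n-1)(n-2)} g_3$, where $g_m$ is the $m$th sample moment about the sample mean.

```python
from fractions import Fraction
from itertools import product


def g(xs, m):
    """Sample m-th moment about the sample mean."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** m for x in xs) / n


def f2(xs):
    n = len(xs)
    return Fraction(n, n - 1) * g(xs, 2)


def f3(xs):
    n = len(xs)
    return Fraction(n * n, (n - 1) * (n - 2)) * g(xs, 3)


def expectation(stat, support, probs, n):
    """Exact expectation of stat over all samples of size n."""
    total = Fraction(0)
    for idx in product(range(len(support)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= probs[i]
        total += weight * stat([support[i] for i in idx])
    return total


def central_moment(support, probs, m):
    """Population m-th moment about the expected value."""
    mean = sum(p * x for p, x in zip(probs, support))
    return sum(p * (x - mean) ** m for p, x in zip(probs, support))


# Hypothetical distribution and sample size used only for this check.
support = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 4

print(expectation(f2, support, probs, n), central_moment(support, probs, 2))
print(expectation(f3, support, probs, n), central_moment(support, probs, 3))
print(expectation(lambda xs: g(xs, 3), support, probs, n))
```

The first two lines print equal pairs, while the uncorrected sample moment in the last line falls short of the population value.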


(2) For any Borel set $E$ on the real line denote by $F_E(P)$ the probability,
$P(E)$, assigned by $P$ to $E$. If $\varphi_E(x)$ is the characteristic function of
the set $E$, the representation

$$F_E(P) = \int \varphi_E(x) \, dP(x)$$

shows that $F_E(P)$ is homogeneous of degree 1, and therefore possesses unbiased
estimates of all orders. The symmetric unbiased estimate of order $n$ is given, in
perfect accordance with intuitive demands, by the function
$f_E(x_1, \ldots, x_n)$ whose value is $\frac{1}{n}$ times the number of those
coordinates which belong to $E$.

(3) The situation in estimating such "non linear" functions as $(F_1(P))^2$ is
paradoxical. In the first place it appears strange that there should be
essentially different processes for estimating the expected value and the square
of the expected value. (Recall that since

$$(F_1(P))^2 = \int \int x_1 x_2 \, dP(x_1) \, dP(x_2),$$

the symmetric unbiased estimate of $(F_1(P))^2$, of order $n$, is
$(x_1 x_2)^{[n]}$.) Consider, for instance, the distribution $P$ which assigns
probability $\frac{1}{2}$ to each of the points $+1$ and $-1$. The symmetric unbiased
estimate of order 2 for $F_1(P)$ is $\frac{1}{2}(x_1 + x_2)$, and for $(F_1(P))^2$ it
is $x_1 x_2$. Hence in the four possible cases

$$(1, 1), \quad (1, -1), \quad (-1, 1), \quad (-1, -1)$$

the biased, incorrect estimate $\{\frac{1}{2}(x_1 + x_2)\}^2$ for $(F_1(P))^2$ yields

$$1, \quad 0, \quad 0, \quad 1,$$

whereas the unbiased, correct estimate yields

$$1, \quad -1, \quad -1, \quad 1.$$

The actual value of $(F_1(P))^2$ is, of course, 0. Hence it is true in this case
that whenever the biased estimate is in error, the unbiased one errs by the same
amount. To add insult to injury, the unbiased procedure even yields negative
estimates for the essentially non negative quantity $(F_1(P))^2$. These
considerations seem to indicate the necessity for caution in using unbiased
estimates of "non linear" quantities, such for instance as $\bar{F}_m(P)$.

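The four cases of remark (3) can be tabulated mechanically. The snippet below is a minimal illustration, not part of the paper; the variable names are chosen for the example.

```python
from itertools import product

# The four equally likely samples of size 2 from the distribution on {+1, -1}.
samples = list(product((1, -1), repeat=2))

biased = [((x1 + x2) / 2) ** 2 for x1, x2 in samples]   # square of the sample mean
unbiased = [x1 * x2 for x1, x2 in samples]              # symmetric unbiased estimate

for sample, b, u in zip(samples, biased, unbiased):
    print(sample, b, u)

# Averages over the four cases: the biased estimate averages 1/2, the unbiased one 0.
print(sum(biased) / 4, sum(unbiased) / 4)
```

The averages make the trade-off explicit: the unbiased estimate is right on average, yet it is off by 1 in every single sample and is negative in two of the four.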


SOME SIGNIFICANCE TESTS BASED ON ORDER STATISTICS 
1. Summary. In this paper significance tests are developed whose application
requires only the determination of one order statistic and the computation of
sums of sample values. The simplest case considered is that of testing a new
sample value $x$ on the basis of $m$ previous sample values $y_1, \ldots, y_m$, all
sample values being assumed from normal populations with the same variance. Two
separate tests of whether the mean of the new population from which $x$ was taken
exceeds the mean of the population from which $y_1, \ldots, y_m$ were drawn consist
in accepting the alternative that the new population mean exceeds the old
population mean if

(1) $$x > \frac{\sqrt{m+1} + 1}{m} \sum_{i=1}^{m} y_i - \sqrt{m+1} \; y_{(u)}$$

(2) $$x > -\frac{\sqrt{m+1} - 1}{m} \sum_{i=1}^{m} y_i + \sqrt{m+1} \; y_{(m+1-u)},$$

where $y_{(u)}$ is the $u$th largest of $y_1, \ldots, y_m$. It can be shown that both
of these tests have the same power so that either one might be equally well
selected for use. In practical application, however, there may exist reasons for
preferring one test to the other. Similarly, the alternative that the new
population mean is less than the old population mean will be accepted if

(3) $$x < \frac{\sqrt{m+1} + 1}{m} \sum_{i=1}^{m} y_i - \sqrt{m+1} \; y_{(m+1-u)}$$

(4) $$x < -\frac{\sqrt{m+1} - 1}{m} \sum_{i=1}^{m} y_i + \sqrt{m+1} \; y_{(u)}.$$

All four of these significance tests have the same power, also the same
significance level $\alpha(u, m)$. By appropriate choice of $u$ and $m$ this
significance level can be made to assume values suitable for significance tests.
For example,

$$\alpha(1, 6) = .0156, \qquad \alpha(2, 10) = .0107,$$

$$\alpha(3, 13) = .0110, \qquad \alpha(4, 16) = .0107.$$

These tests are still valid if each of $x, y_1, \ldots, y_m$ equals a sum of $r$
sample values [illegible] a sum of $r$ new [illegible] sum of relatively weighted
past sample values [illegible] but not as an order statistic. The introduction of
this weighted sum allows less reliable past information to be lumped together
and weighted according to its relative importance.
In comparing the order statistic tests with the most powerful tests which could
be used for these alternatives it is found that the size of the samples used must be
increased in order to bring the efficiency of the order statistic test up to that of
the corresponding most powerful test. Thus the advisability of using the order 
statistic test will depend upon whether it is more desirable to take larger samples 
but have less computation. 

2. Introduction. Many statistical problems are concerned with the determi- 
nation of whether a new sample can be considered as having been drawn from the 
same population as that from which a previous sample was taken. Frequently 
this reduces to the question of whether the mean of the population from which 
the new sample came is greater than the mean of the past sample population. 
The problem of whether the new population mean is less than that of the old 
population is also occasionally investigated. If both populations can be con- 
sidered normal with the same variance, it is well known that the most powerful 
Studentized test of each of these one-sided alternatives is furnished by use of the 
appropriate Student $t$-test. When the number of previous sample values from
which the test is determined is large, however, the computation of the numerical
value required for the application of the Student $t$-test becomes lengthy. This
calculation difficulty can become very important if the test is to be applied 
repeatedly as, for example, in quality control work. It is desirable, therefore, 
to develop other Studentized tests which are easily calculated and whose efficiency 
with relation to the corresponding Student $t$-tests is reasonably high. It is the
purpose of this paper to develop tests of this type by the use of order statistics. 

The class of tests in which a new sample value $x$ is tested on the basis of $m$
previous sample values $y_1, \ldots, y_m$ used as order statistics is developed in
detail. The significance tests arising are the ones given in the summary above.
For a better intuitive understanding of what takes place rewrite (1) to (4) as

(1') $$x - \bar{y} > \sqrt{m+1} \, (\bar{y} - y_{(u)})$$

(2') $$x - \bar{y} > \sqrt{m+1} \, (y_{(m+1-u)} - \bar{y})$$

(3') $$x - \bar{y} < \sqrt{m+1} \, (\bar{y} - y_{(m+1-u)})$$

(4') $$x - \bar{y} < \sqrt{m+1} \, (y_{(u)} - \bar{y}),$$

where $\bar{y}$ is the average of the $y_i$. The relative efficiencies of these tests with
respect to the corresponding Student $t$-tests are determined and the simplicity
of the computation necessary for their application is outlined. The method of 
attack having been sufficiently indicated by the development of this special 
class of tests, more general tests based on order statistics are stated but not proved 
here. 

3. Statement of the significance tests. Let each of $x, y_1, \ldots, y_m$ be
distributed independently of all the others, $x$ according to $N(\nu, \sigma)$ and
the $y_i$ ($i = 1, \ldots, m$) according to $N(\mu, \sigma)$, where the notation
$N(\xi, \sigma)$ signifies the normal distribution with mean $\xi$ and variance
$\sigma^2$. As above let $y_{(u)}$ denote the $u$th largest of $y_1, \ldots, y_m$. The
one-sided significance tests are then stated as follows:

If

(5) $$x > \frac{1}{K_2} \sum_{i=1}^{m} y_i - \frac{K_1}{K_2} \, y_{(u)} \qquad (K_2 > 0)$$

$$x > \frac{1}{K_2} \sum_{i=1}^{m} y_i - \frac{K_1}{K_2} \, y_{(m+1-u)} \qquad (K_2 < 0)$$

accept the alternative $\mu < \nu$, otherwise accept the hypothesis tested, namely
that $\mu = \nu$.

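If

(6) $$x < \frac{1}{K_2} \sum_{i=1}^{m} y_i - \frac{K_1}{K_2} \, y_{(m+1-u)} \qquad (K_2 > 0)$$

$$x < \frac{1}{K_2} \sum_{i=1}^{m} y_i - \frac{K_1}{K_2} \, y_{(u)} \qquad (K_2 < 0)$$

accept $\nu < \mu$, otherwise accept $\nu = \mu$.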

The constants $K_1$ and $K_2$ are given by

(7) $$K_1 = m + 1 \pm \sqrt{m+1}, \qquad K_2 = -1 \mp \sqrt{m+1},$$

where all upper signs or all lower signs will be chosen so that to a given value
of $K_1$ there is but one value of $K_2$. This rule for the choice of signs will
hold throughout the paper.

It is to be noted that (5) defines two separate significance tests of the
hypothesis $\mu = \nu$ against the alternative $\mu < \nu$ depending upon whether
it is decided to use the positive or the negative value given for $K_2$. A
similar statement applies to the two significance tests defined by (6).
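Test (5) is simple enough to state as a short function. The sketch below is an illustration and not the author's procedure verbatim: the function names are hypothetical, the constants follow (7) with the lower signs giving $K_2 > 0$, and $y_{(u)}$ is assumed to mean the $u$th value counted from the smallest, the reading under which the significance level for $u = 1$ is $(\frac{1}{2})^m$.

```python
import math


def constants(m, positive_k2=True):
    """K_1 and K_2 from (7); lower signs give K_2 > 0, upper signs give K_2 < 0."""
    root = math.sqrt(m + 1)
    if positive_k2:
        return m + 1 - root, -1 + root
    return m + 1 + root, -1 - root


def accepts_mu_less_than_nu(x, ys, u, positive_k2=True):
    """Test (5): True when the alternative mu < nu is accepted."""
    m = len(ys)
    k1, k2 = constants(m, positive_k2)
    ordered = sorted(ys)            # ordered[0] is y_(1), assumed to be the smallest
    index = u if positive_k2 else m + 1 - u
    threshold = sum(ys) / k2 - (k1 / k2) * ordered[index - 1]
    return x > threshold


past = [1, 2, 3, 4, 5, 6]           # hypothetical previous sample values, m = 6
for x in (10.0, 10.2):
    print(x, accepts_mu_less_than_nu(x, past, u=1),
          accepts_mu_less_than_nu(x, past, u=1, positive_k2=False))
```

For this symmetric set of past values both versions of the test share the threshold $\bar{y} + 2.5\sqrt{7} \approx 10.11$, so they agree on both trial values of $x$; in general the two versions use different order statistics and need not agree sample by sample.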

Each of these four significance tests can be shown to have the same significance
level, which is determined by the values of $u$ and $m$. Denote this significance
level by $\alpha(u, m)$. Then it can be demonstrated that

$$\alpha(1, m) = (\tfrac{1}{2})^m, \qquad \alpha(2, m) = (m + 1)(\tfrac{1}{2})^m,$$

$$\alpha(3, m) = (m^2 + m + 2)(\tfrac{1}{2})^{m+1}, \qquad \alpha(4, m) = \tfrac{1}{3}(m^3 + 5m + 6)(\tfrac{1}{2})^{m+1}.$$

The general expression for $\alpha(u, m)$ is given by (12).
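The closed forms for $u = 1, \ldots, 4$ can be checked against a single expression. The sketch below rests on an assumption not stated in the text: that the significance level equals the probability that at most $u - 1$ of $m$ independent standard normal values are negative, which gives the binomial sum $\alpha(u, m) = 2^{-m} \sum_{j=0}^{u-1} \binom{m}{j}$. The function and dictionary names are hypothetical.

```python
from fractions import Fraction
from math import comb


def alpha(u, m):
    """Significance level as a binomial sum (assumed equivalent of the integral (12))."""
    return Fraction(sum(comb(m, j) for j in range(u)), 2 ** m)


# Closed forms for u = 1, ..., 4 as read from the text.
closed_forms = {
    1: lambda m: Fraction(1, 2) ** m,
    2: lambda m: (m + 1) * Fraction(1, 2) ** m,
    3: lambda m: (m * m + m + 2) * Fraction(1, 2) ** (m + 1),
    4: lambda m: Fraction(1, 3) * (m ** 3 + 5 * m + 6) * Fraction(1, 2) ** (m + 1),
}

for u, m in ((1, 6), (2, 10), (3, 13), (4, 16)):
    print(u, m, float(alpha(u, m)), alpha(u, m) == closed_forms[u](m))
```

The binomial form makes it easy to search for pairs $(u, m)$ whose level is close to a desired value such as .01.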

It is to be observed that the application of these tests is independent of the 
parameters of the normal populations in question. 


[illegible] tests are almost identical with that for the [illegible]
Now consider this analysis. Let

$$x' = \frac{x - \nu}{\sigma}, \qquad y'_i = \frac{y_i - \mu}{\sigma} \qquad (i = 1, \ldots, m).$$

Then $x'$ and the $y'_i$ are independently distributed according to $N(0, 1)$. Define

$$r_u = \frac{1}{K_1} \Big[ K_1 y'_u - \sum_{i=1}^{m} y'_i + K_2 x' \Big] \qquad (u = 1, \ldots, m).$$

It is easily seen that

$$E(r_u) = 0, \qquad E(r_u^2) = \frac{1}{K_1^2} (K_1^2 + K_2^2 - 2K_1 + m),$$

$$E(r_u r_v) = \frac{1}{K_1^2} (K_2^2 - 2K_1 + m), \qquad (u \neq v).$$

Thus the condition which must be fulfilled in order that the $r_u$ be
independently distributed according to $N(0, 1)$ is that

(8) $$K_2^2 - 2K_1 + m = 0.$$

To insure that the $r_u$ are independent of $\mu$ when $\mu = \nu$ it is evidently
necessary that

(9) $$K_1 - m + K_2 = 0.$$

Solving (8) and (9) for $K_1$ and $K_2$ one obtains (7).
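A quick numerical check that the constants in (7) satisfy both conditions may be useful. The sketch below is illustrative only; it reads (8) as $K_2^2 - 2K_1 + m = 0$ and (9) as $K_1 - m + K_2 = 0$, and the function name is hypothetical.

```python
import math


def k_constants(m, upper=True):
    """K_1 and K_2 from (7), taking all upper signs or all lower signs."""
    root = math.sqrt(m + 1)
    sign = 1 if upper else -1
    return m + 1 + sign * root, -1 - sign * root


residuals = []
for m in (6, 10, 13, 16):
    for upper in (True, False):
        k1, k2 = k_constants(m, upper)
        cond8 = k2 ** 2 - 2 * k1 + m        # condition (8), should vanish
        cond9 = k1 - m + k2                 # condition (9), should vanish
        # With (8) satisfied, the second moment of r_u reduces to 1.
        second_moment = ((k1 - 1) ** 2 + (m - 1) + k2 ** 2) / k1 ** 2
        residuals.append((cond8, cond9, second_moment - 1))
        print(m, upper, cond8, cond9, second_moment)
```

Both residuals are zero up to rounding for either choice of signs, and the second moment of each $r_u$ comes out as 1, in agreement with the requirement that the $r_u$ follow $N(0, 1)$.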

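Restrict the $r_u$ by conditions (8) and (9) and let $r_{(u)}$ be the $u$th largest
of $r_1, \ldots, r_m$. From (8) $K_1 > 0$; therefore

$$r_{(u)} = \frac{1}{K_1} \Big[ K_1 y'_{(u)} - \sum_{i=1}^{m} y'_i + K_2 x' \Big],$$

where $y'_{(u)}$ is the $u$th largest of $y'_1, \ldots, y'_m$. Then using (9),

$$r_{(u)} = \frac{1}{K_1 \sigma} \Big[ K_1 y_{(u)} - \sum_{i=1}^{m} y_i + K_2 x + K_2 (\mu - \nu) \Big].$$

From the definition of the power function and (5) for $K_2 > 0$, it follows that
the power function for this test is given by

(10) $$\text{Power Function} = \Pr \Big[ x > \frac{1}{K_2} \sum_{i=1}^{m} y_i - \frac{K_1}{K_2} \, y_{(u)} \Big]$$

$$= \Pr \Big[ 0 < K_1 y_{(u)} - \sum_{i=1}^{m} y_i + K_2 x < \infty \Big]$$

$$= \Pr \Big[ \frac{K_2}{K_1 \sigma} (\mu - \nu) < \frac{1}{K_1 \sigma} \Big( K_1 y_{(u)} - \sum_{i=1}^{m} y_i + K_2 x + K_2 (\mu - \nu) \Big) < \infty \Big]$$

$$= \Pr \Big[ \frac{K_2}{K_1 \sigma} (\mu - \nu) < r_{(u)} < \infty \Big].$$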
The distribution function of the order statistic $r_{(u)}$ may be found in [1],
from which it follows that

(11) $$\text{Power Function} = \frac{m!}{(u-1)! \, (m-u)!} \int_{\frac{K_2}{K_1 \sigma}(\mu - \nu)}^{\infty} \Big( \int_{-\infty}^{z} f(y) \, dy \Big)^{u-1} \Big( \int_{z}^{\infty} f(y) \, dy \Big)^{m-u} f(z) \, dz,$$

where

$$f(y) = \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2}.$$

Consider the value of the power function under the assumption that the
hypothesis is true. Then $\mu = \nu$ and from (11) the significance level of the
test is given by

(12) $$\alpha(u, m) = \frac{m!}{(u-1)! \, (m-u)!} \int_{0}^{\infty} \Big( \int_{-\infty}^{z} f(y) \, dy \Big)^{u-1} \Big( \int_{z}^{\infty} f(y) \, dy \Big)^{m-u} f(z) \, dz.$$

The method used to eliminate $\sigma$ from the quantities required for the
application of the significance test, therefore, is to have the limits 0 and
$\infty$ in the probability expression (10) for the power function when the
hypothesis is true. Suitable significance levels are obtained by varying the
statistical function $r_{(u)}$ by means of the selection of the values of $u$ and $m$.


5. Comparison with Student t-test. The test considered is that of a single
sample value on the basis of m other sample values. Hence, the corresponding
Student t-test has m - 1 degrees of freedom. The probabilities of Type II
errors for the Student t-tests are calculated for values of

$$\delta = \frac{|\mu - \nu|}{\sigma \sqrt{(m+1)/m}}$$

by use of the normal approximation given in [2].
Using this notation

$$\frac{K_2}{K_1 \sigma}(\mu - \nu) = -\frac{\delta}{\sqrt{m}},$$

and from (11) the power function for the significance test for which the alternative
is $\mu < \nu$ and $K_2 > 0$ is found to be

$$\frac{m!}{(u-1)!\,(m-u)!} \int_{-\delta/\sqrt{m}}^{\infty} \Big(\int_{-\infty}^{z} f(y)\,dy\Big)^{u-1} \Big(\int_{z}^{\infty} f(y)\,dy\Big)^{m-u} f(z)\,dz.$$

The probability of a Type II error for a given value of $\delta$ is equal to one minus
the value of the power function for this value of $\delta$.
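
The power integral can be evaluated without numerical quadrature, since it is the probability that the $u$th order statistic (counted from the smallest) of $m$ independent $N(0, 1)$ values exceeds the lower limit of integration. The Python sketch below does this with a binomial sum. It assumes that the lower limit is $-\delta/\sqrt{m}$; the function names are illustrative and not part of the paper.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def order_statistic_power(u, m, delta):
    """Power of the order statistic test, evaluated as a binomial sum.

    Assumption: the lower limit of integration in (11) is -delta / sqrt(m).
    """
    below = normal_cdf(-delta / sqrt(m))  # chance that one N(0, 1) value lies below the limit
    # The u-th smallest value exceeds the limit when at most u - 1 values lie below it.
    return sum(comb(m, k) * below ** k * (1.0 - below) ** (m - k) for k in range(u))


def type_ii_error(u, m, delta):
    """Probability of a Type II error: one minus the power."""
    return 1.0 - order_statistic_power(u, m, delta)


if __name__ == "__main__":
    for u, m in [(1, 6), (2, 10)]:
        print(u, m, [round(type_ii_error(u, m, d), 3) for d in (1.0, 2.0, 3.0, 4.0)])
```

With `delta = 0` the function returns the significance level of the test, which gives a quick consistency check against (12).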
It can be proved that the other three significance tests have the same proba-
bilities of Type II errors as the one considered above.

The numerical comparison of the two types of tests is contained in Table I.
In each instance the significance level was chosen to be approximately .01.

The process of increasing the size of each sample by a given percentage has
practical meaning if each of $x, y_1, \cdots, y_m$ equals the sum of r sample values.
For example, if $x, y_1, \cdots, y_m$ each consist of the sum of ten sample values,
increasing the sample size by 30% would amount to letting $x, y_1, \cdots, y_m$ each
equal the sum of thirteen sample values. The case where each of $x, y_1, \cdots, y_m$


TABLE I

        Degrees of          % Increase in   Significance
Test     Freedom      m      Sample Size       Level        Probability of Type II Error

t           5                     0            .0156        .919    .750    .477    .215
O.S.                  6           0            .0156        .919    .752    .506    .276
O.S.                  6           5            .0156        .916    .742    .486    .256
O.S.                  6          10            .0156        .914    .732    .469    .239

t           9                     0            .0107        .930    .735    .413    .142
O.S.                 10           0            .0107        .936    .782    .527    .270
O.S.                 10          20            .0107        .927    .738    .448    .191
O.S.                 10          30            .0107        .921    .715    .411    .161

t          12                     0            .0110        .920    .699    .358    .106
O.S.                 13           0            .0110        .933    .771    .492    .245
O.S.                 13          30            .0110        .919    .717    .378    .139
O.S.                 13          40            .0110        .913    .679    .353    .119

t          15                     0            .0107        .919    .688    .337    .092
O.S.                 16           0            .0107        .938    .765    .488    .234
O.S.                 16          40            .0107        .917    .687    .351    .111
O.S.                 16          50            .0107        .912    .664    .310    .090


equals the sum of r sample values will be treated later and will be shown to be 
a particular case of the one analyzed above. 

In Table I the order statistic tests (O.S.) are calculated for cases where the 
size of each sample is increased by the same percentage. This amounts to saying 
that the amount of information used for the test has been increased by this 
percentage. This method furnishes a quantitative estimate of the relative 
efficiency of the order statistic test as compared with the corresponding Student 
t-test. For example, if 30% more information is required for the order statistic
test to have the same probabilities of Type II errors as the corresponding Student
t-test, then the order statistic test will be said to have a relative efficiency of
100/130, or approximately 77%.

JOHN E. WALSH


Examination of Table I shows that the order statistic tests have the approxi-
mate relative efficiencies listed in Table II. These relative efficiencies can be
shown to be approximately the same as those for other significance levels.


6. Computation required. Since application of the order statistic test requires
only the determination of one order statistic, the calculation of one sum, the
multiplication of each of these quantities by given constants and the subtraction
of the resulting values, the amount of computation required for application of the
order statistic test is obviously much less than is necessary for the application of
the corresponding Student t-test.

If the test is applied continuously from one sample to the next, as in quality
control work, the value of $\sum y_i$ can be calculated by a continuous process. For
let the sample values be taken in the order $y_1, \cdots, y_m, x$, where x is the new


TABLE II

  m     Significance Level     % Increase in Sample Size     Relative Efficiency

  6          .0156                         5                        95%
 10          .0107                        25                        80%
 13          .0110                        35                        74%
 16          .0107                        43                        70%


sample value which is to be tested on the basis of the previous m sample values
$y_1, \cdots, y_m$. Then x for the present test becomes $y_m$ for the next test; $y_m$ be-
comes $y_{m-1}$; $\cdots$; $y_2$ becomes $y_1$, and $y_1$ for the present test is no longer used.
The value of x will be furnished by the next sample value drawn. Thus, $\sum y_i$
for the next test is calculated by adding $x - y_1$ for the present test to $\sum y_i$ for
the present test. The order statistic can be easily determined from a plot of the
sample values which is also applied continuously from one sample to the next.
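
The continuous updating of the sum described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, under the assumption that the m reference values are held in order of arrival; the class name `RunningReference` and its methods are hypothetical, and sorting stands in for reading the order statistic off a plot.

```python
from collections import deque


class RunningReference:
    """Holds the m most recent reference values y_1, ..., y_m and their sum."""

    def __init__(self, values):
        self.window = deque(values)    # y_1 (oldest) ... y_m (newest)
        self.total = sum(self.window)  # current value of sum(y_i)

    def advance(self, x):
        """Move to the next test: x becomes y_m and the old y_1 is dropped."""
        oldest = self.window.popleft()
        self.window.append(x)
        self.total += x - oldest       # add x - y_1 to the previous sum
        return self.total

    def order_statistic(self, u):
        """u-th value of the current window, counted from the smallest."""
        return sorted(self.window)[u - 1]


if __name__ == "__main__":
    reference = RunningReference([4.1, 3.9, 4.3, 4.0])
    for new_value in [4.2, 3.8, 4.4]:
        print(reference.advance(new_value), reference.order_statistic(1))
```

Each step costs one addition and one subtraction for the sum, which is the saving in computation that the section points to.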


7. Generalization of results. The derivations given above are immediately
applicable to the case where x represents the sum of r sample values from a popu-
lation with distribution $N(\nu', \sigma'^2)$ and each $y_i$, $(i = 1, \cdots, m)$, represents the sum
of r sample values from a population with distribution $N(\mu', \sigma'^2)$. Then x would
be distributed according to $N(r\nu', r\sigma'^2)$ and the $y_i$ would be distributed according
to $N(r\mu', r\sigma'^2)$. These distributions are of the form $N(\nu, \sigma^2)$ and $N(\mu, \sigma^2)$.
If x equals the sum of r sample values from a population with distribution
$N(\nu, \sigma^2)$ and each $y_i$, $(i = 1, \cdots, m)$, equals the sum of s sample values from a
population with distribution $N(\mu, \sigma^2)$, the significance tests are derived in a
similar manner and can still be stated in the forms (5) and (6), but the values of
$K_1$ and $K_2$ become

$$K_1 = m + \frac{r}{s} \pm \sqrt{\frac{r}{s}\Big(m + \frac{r}{s}\Big)}, \qquad K_2 = -1 \mp \sqrt{1 + \frac{sm}{r}}.$$

The power function for the test in which the alternative is $\mu < \nu$ and $K_2 > 0$
is found by replacing $\frac{K_2}{K_1 \sigma}(\mu - \nu)$ by $\frac{r K_2}{\sqrt{s}\, K_1 \sigma}(\mu - \nu)$ in (11). The significance
level of each of the four tests is again furnished by (12) and it can be shown that
each test has the same power.

To this point all significance tests considered have consisted of testing a new 
sample on the basis of m previous samples used as order statistics. In some 
cases, however, it may be desirable to utilize additional samples in the test but 
not as order statistics. These sample values can be gathered together in a 
summation term in which values from different samples are given relative 
numerical weighting. This procedure can be used to emphasize those sample 
values which appear to be more important from practical consideration with 
relation to those which seem to have less importance. The determination of 
what relative weighting scheme to use is to be decided by the person applying 
the test and is not considered as a problem of this paper. The significance 
tests with this property can be stated as follows: 

Let each of $x_a$, $y_{ib}$, $z_{jc}$, $(a = 1, \cdots, r;\ b = 1, \cdots, s;\ c = 1, \cdots;\ i = 1, \cdots, m;\ j = 1, \cdots, n)$,
be distributed independently of all the others, the $x_a$
according to $N(\nu, \sigma^2)$ and the $y_{ib}$ and $z_{jc}$ according to $N(\mu, \sigma^2)$. Define

$$y_u = \sum_{b=1}^{s} y_{ub}, \qquad (u = 1, \cdots, m),$$

and let $y_{(u)}$ be the $u$th largest of $y_1, \cdots, y_m$.

The one-sided significance tests are then given by


If 



— Vr 
K % 


"V "m-H— 1 u 


(Kt > 0) 



(Kt < 0 ) 


accept the alternative $\mu < \nu$, otherwise accept $\mu = \nu$.

If 

±x a < V u (Kt > 0) 

1 A 2 



-V7 

k 2 


v 


m-f-l—u 


(Kt < 0), 


accept $\mu > \nu$, otherwise accept $\mu = \nu$.
The quantity $V_u$ is given by



where the constants $C_j$, $(j = 1, \cdots, n)$, are defined by $C_j = w_j \eta$, the $w_j$ being
given positive weights. The values of $\eta$, $K_1$ and $K_2$ are


- kf- 

_ b y s 


K * 


K, 


^ m + A'/B’ 
n + A- IB 


rr Wl ( , A 1 | , ft v \ 

k '‘WTWb\ m + 'b + V"° 


Him + 




£(w-M 4 /B)* + A* 


where 


1 £ 


#'=L W > 


The quantity $\eta$ in the expressions for the $C_j$ is not considered given but is de-
termined in the derivation of the tests. The two equations corresponding to
(8) and (9) then contain three undetermined quantities $\eta$, $K_1$ and $K_2$. Thus
there are infinitely many possible selections of these quantities, each selection
resulting in a valid significance test. The values of $\eta$, $K_1$ and $K_2$ given above,
however, are the ones which result in the maximum power function and conse-
quently the smallest probabilities of Type II errors. The power function for the
test in which the alternative is $\mu < \nu$ and $K_2 > 0$ is that given in (11) with
$\frac{K_2}{K_1 \sigma}(\mu - \nu)$ replaced by $\frac{K_2 \sqrt{r}}{K_1 \sigma}(\mu - \nu)$. The significance level of each of the four
tests given above is still that of (12). It can also be shown that each of the tests
has the same probabilities of Type II errors.
CHAINS OF RARE EVENTS

By Felix Cernuschi 1 and Louis Castagnetto
Harvard University

1. Summary. The negative binomial distribution of Greenwood and Yule is
generalized and modified in order to obtain distribution curves which could be
used in many concrete cases of chains of rare events. Assuming that the num-
bers of single, double, triple, and so on, events are distributed according to Pois-
son's law with parameters $\lambda_1, \lambda_2, \lambda_3, \cdots$ respectively, and that $\lambda_s$ is given by
$\lambda_s = \lambda_1 \dfrac{a^{s-1}}{s!}$, the probability of obtaining n successful events is studied. In the
considered relation $\lambda_s$, for convenient values of a, first increases with s and after
a certain saturation value of s starts to decrease. A relation of this type is very
suitable for studying the distribution of score in a match between two first class
billiard players, the probability of accidents on a highway of dense traffic, etc.
The general methods of finding the distribution curves for arbitrary relations
between the $\lambda$'s are indicated. The method of steepest descent is applied to find
an acceptable approximation of the distribution function; and the advantage of
this method is pointed out for other similar cases, in addition to the concrete one
which was developed, in which the method of direct expansion into power series
becomes inapplicable.

2. Introduction. M. Greenwood and G. U. Yule [1] have deduced the nega-
tive binomial distribution from a compound Poisson law:

$$P(m) = \frac{e^{-\lambda} \lambda^{m}}{m!},$$

where $\lambda$ itself is a random variable distributed according to Pearson's law of
type III:

$$P(\lambda)\,d\lambda = \frac{p^{a+1}}{a!}\, \lambda^{a} e^{-p\lambda}\, d\lambda.$$

They obtained the distribution

$$P(m) = (1 - q)^{a+1}\, \frac{(a + m)!}{a!\, m!}\, q^{m},$$

where $1 - q = \dfrac{p}{p + 1}$. As is easily seen, P(m) is given by the coefficient
of $x^m$ in the expansion of:

$$\Big(\frac{1 - q}{1 - qx}\Big)^{a+1}.$$
R. Lüders [2] has arrived at a negative binomial law by the following considera-
tions. Certain events, like automobile accidents, can be classified as simple or
multiple according to the number of units involved. Assume that the numbers of
single, double, triple, and so on, events are distributed according to Poisson's law
with the parameters $\lambda_1, \lambda_2, \lambda_3, \cdots$, respectively. The probability of obtaining
$n_1$ single, $n_2$ double, $n_3$ triple, $\cdots$ successful events is (assuming mutual inde-
pendence)

(1) $P(n_1, n_2, n_3, \cdots; \lambda_1, \lambda_2, \lambda_3, \cdots) = e^{-(\lambda_1 + \lambda_2 + \cdots)}\, \dfrac{\lambda_1^{n_1}}{n_1!}\, \dfrac{\lambda_2^{n_2}}{n_2!} \cdots.$

The total number of successful events is

(2) $n = n_1 + 2n_2 + 3n_3 + \cdots + s n_s + \cdots.$

The probability of obtaining n successful events is given by the sum of all expres-
sions (1) subject to the condition (2). This sum is given by the coefficient of
$x^n$ in the expansion
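(3) $f(x) = e^{-(\lambda_1 + \lambda_2 + \cdots)}\, e^{\lambda_1 x + \lambda_2 x^2 + \cdots}.$

Now if the parameters $\lambda_s$ satisfy

(4) $\lambda_s = \lambda_1 \dfrac{a^{s-1}}{s},$

one finds

(5) $f(x) = \Big(\dfrac{1 - a}{1 - ax}\Big)^{\lambda_1/a}$

and

(6) $P(n) = (1 - a)^{\lambda_1/a}\, \dfrac{\Gamma(\lambda_1/a + n)}{\Gamma(\lambda_1/a)\, n!}\, a^{n}.$
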
Taking $\lambda_1/a$ equal to $a + 1$ one gets Greenwood and Yule's distribution in the
form given above [3]. The negative binomial law has useful applications, for
instance in some cases of accidents of workers in factories. It is proved that
with values of a near 1, the most probable value for n is n = 0 and the average
value is a finite number different from zero. Therefore the distribution will be
in some way similar to the distribution of the scores in a match between two first
class billiard players whose most frequent scores are zero and their average may
be, say, 50. In the case of the Poisson distribution the most frequent score and
the average score should be nearly the same. The relation (4) does not provide
an adequate description of many practical distributions. For instance in a
match between two first class billiard players, the probability of making a second,
third, $\cdots$, point will be considerably greater than the probability of making
the first. With the relation (4) $\lambda_s$ is a decreasing function of s, while we shall
investigate cases in which $\lambda_s$ first increases with s and after a certain value of s
starts to decrease. As other examples of distributions of similar types we shall
mention the following: On a highway with dense traffic at high speeds the prob-
ability of only one car being involved in an accident may be smaller than the
probability of having several cars involved. Something similar may be said for
the cases of work accidents in factories where the work of one is interconnected
with the work of others. In many cases of telephone calls (business transactions,
organization of meetings, etc.) the simple Poisson law is not suitable to interpret
the distribution of calls, since one call may increase the probability that the
called person makes one or more calls.

The purpose of this paper is to treat the problem when, instead of (4), we take
other expressions which may in a better way describe some processes such as the
ones which we have referred to.


3. Modification and generalization of the scheme of Greenwood-Yule and
Lüders. According to the relation (4) $\lambda_s$ is a decreasing function of s and the
parameter a must be in the interval 0 < a < 1. Instead of (4) we shall use

(7) $\lambda_s = \lambda_1 \dfrac{a^{s-1}}{s!},$

where a may have any positive value. In particular for a = 0 our case reduces
to the Poisson case.

From (7) it follows that

(8) $\dfrac{\lambda_{s+1}}{\lambda_s} = \dfrac{a}{s + 1}$

and we see that $\lambda_s$ increases with s for s + 1 < a and decreases for s + 1 > a.
Substituting from (7) in (3) we get

(9) $f(x) = e^{-(\lambda_1/a)\, e^{a}}\, e^{(\lambda_1/a)\, e^{ax}}.$

As the probability of obtaining n successful events is given by the coefficient
of $x^n$ in (9), we shall expand $e^{\alpha e^{\beta x}}$ in power series ($\alpha$, $\beta$ being two arbitrary
constants). We have

(10) $e^{\alpha e^{\beta x}} = 1 + \sum_{m=1}^{\infty} \dfrac{\alpha^{m}}{m!} + \sum_{n=1}^{\infty} \dfrac{\beta^{n} x^{n}}{n!} \sum_{m=1}^{\infty} \dfrac{m^{n} \alpha^{m}}{m!},$

(11) $\sum_{m=1}^{\infty} \dfrac{m^{n} \alpha^{m}}{m!} = e^{\alpha}\, y_n(\alpha),$
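where [4]

(12)
$$y_1(\alpha) = \alpha, \quad y_2(\alpha) = \alpha^{2} + \alpha, \quad y_3(\alpha) = \alpha^{3} + 3\alpha^{2} + \alpha, \quad \cdots, \quad y_n(\alpha) = \sum_{i=1}^{n} \frac{\Delta^{i} 0^{n}}{i!}\, \alpha^{i}.$$

Here we use the notation of differences of zero: $\Delta^{i} 0^{n}$. We have

(13) $e^{\alpha e^{\beta x}} = e^{\alpha}\Big[1 + \sum_{n=1}^{\infty} \dfrac{\beta^{n} x^{n}}{n!}\, y_n(\alpha)\Big] = e^{\alpha}\Big[1 + \sum_{n=1}^{\infty} \dfrac{\beta^{n} x^{n}}{n!} \sum_{i=1}^{n} \dfrac{\Delta^{i} 0^{n}}{i!}\, \alpha^{i}\Big].$

Now in our case

$$\alpha = \frac{\lambda_1}{a}, \qquad \beta = a,$$

whence

$$P(n) = e^{-(\lambda_1/a)(e^{a} - 1)}\, \frac{a^{n}}{n!} \sum_{i=1}^{n} \frac{\Delta^{i} 0^{n}}{i!} \Big(\frac{\lambda_1}{a}\Big)^{i}, \qquad \text{for } n > 0,$$

(17) $P(0) = e^{-(\lambda_1/a)(e^{a} - 1)}, \qquad \text{for } n = 0.$

We have in particular

(18)
$$\begin{aligned}
P(1) &= \lambda_1 P(0) \\
P(2) &= \tfrac{1}{2!}(\lambda_1^{2} + \lambda_1 a) P(0) \\
P(3) &= \tfrac{1}{3!}(\lambda_1^{3} + 3\lambda_1^{2} a + \lambda_1 a^{2}) P(0) \\
P(4) &= \tfrac{1}{4!}(\lambda_1^{4} + 6\lambda_1^{3} a + 7\lambda_1^{2} a^{2} + \lambda_1 a^{3}) P(0) \\
P(5) &= \tfrac{1}{5!}(\lambda_1^{5} + 10\lambda_1^{4} a + 25\lambda_1^{3} a^{2} + 15\lambda_1^{2} a^{3} + \lambda_1 a^{4}) P(0) \\
P(6) &= \tfrac{1}{6!}(\lambda_1^{6} + 15\lambda_1^{5} a + 65\lambda_1^{4} a^{2} + 90\lambda_1^{3} a^{3} + 31\lambda_1^{2} a^{4} + \lambda_1 a^{5}) P(0) \\
P(7) &= \tfrac{1}{7!}(\lambda_1^{7} + 21\lambda_1^{6} a + 140\lambda_1^{5} a^{2} + 350\lambda_1^{4} a^{3} + 301\lambda_1^{3} a^{4} + 63\lambda_1^{2} a^{5} + \lambda_1 a^{6}) P(0) \\
P(8) &= \tfrac{1}{8!}(\lambda_1^{8} + 28\lambda_1^{7} a + 266\lambda_1^{6} a^{2} + 1050\lambda_1^{5} a^{3} + 1701\lambda_1^{4} a^{4} + 966\lambda_1^{3} a^{5} + 127\lambda_1^{2} a^{6} + \lambda_1 a^{7}) P(0) \\
P(9) &= \tfrac{1}{9!}(\lambda_1^{9} + 36\lambda_1^{8} a + 462\lambda_1^{7} a^{2} + 2646\lambda_1^{6} a^{3} + 6951\lambda_1^{5} a^{4} + 7770\lambda_1^{4} a^{5} + 3025\lambda_1^{3} a^{6} + 255\lambda_1^{2} a^{7} + \lambda_1 a^{8}) P(0) \\
P(10) &= \tfrac{1}{10!}(\lambda_1^{10} + 45\lambda_1^{9} a + 750\lambda_1^{8} a^{2} + 5880\lambda_1^{7} a^{3} + 22827\lambda_1^{6} a^{4} + 42525\lambda_1^{5} a^{5} + 34105\lambda_1^{4} a^{6} + 9330\lambda_1^{3} a^{7} + 511\lambda_1^{2} a^{8} + \lambda_1 a^{9}) P(0).
\end{aligned}$$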

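For $\lambda_1 = a$ it follows that

(19) $P(0) = e^{-e^{a} + 1},$

(20) $P(n) = e^{-e^{a} + 1}\, \dfrac{a^{n}}{n!}\, y_n(1),$

$$\sum_{n=0}^{\infty} P(n) = e^{-e^{a} + 1}\Big[1 + \sum_{n=1}^{\infty} \frac{a^{n}}{n!}\, y_n(1)\Big] = 1.$$

Particular values of (20) are

(21)
$$\begin{aligned}
P(1) &= a P(0), & P(2) &= a^{2} P(0), \\
P(3) &= \frac{5a^{3}}{3!} P(0), & P(4) &= \frac{15a^{4}}{4!} P(0), \\
P(5) &= \frac{52a^{5}}{5!} P(0), & P(6) &= \frac{203a^{6}}{6!} P(0), \\
P(7) &= \frac{877a^{7}}{7!} P(0), & P(8) &= \frac{4140a^{8}}{8!} P(0), \\
P(9) &= \frac{21147a^{9}}{9!} P(0), & P(10) &= \frac{115975a^{10}}{10!} P(0).
\end{aligned}$$

In Figure 1 we have graphed the curves P(n) for the values $\lambda_1/a = 1$; $\lambda_1 = 0.1$,
$\lambda_1 = 1$, $\lambda_1 = 2$. We see in particular, that for $\lambda_1 = 1$ we have P(0) = P(1)
and for $\lambda_1 = a = 1$ we have P(0) = P(1) = P(2).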
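
The particular values of P(n) can be checked numerically. The Python sketch below computes P(n) in two ways: from the polynomial form, using the standard identity that $\Delta^{i} 0^{n}/i!$ is the Stirling number of the second kind $S(n, i)$, and directly as the coefficient of $x^n$ in $f(x) = \exp(-\sum \lambda_s + \sum \lambda_s x^s)$ with $\lambda_s = \lambda_1 a^{s-1}/s!$. The function names are illustrative, and the code assumes a > 0.

```python
from math import exp, factorial


def stirling2_row(n):
    """Return [S(n, 0), ..., S(n, n)], Stirling numbers of the second kind."""
    row = [1]  # S(0, 0)
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for i in range(1, size + 1):
            carried = i * row[i] if i < size else 0  # i * S(size - 1, i)
            new[i] = carried + row[i - 1]            # + S(size - 1, i - 1)
        row = new
    return row


def chain_probability(n, lam1, a):
    """P(n) from the polynomial form: P(0)/n! * sum_i S(n, i) lam1^i a^(n - i)."""
    p0 = exp(-(lam1 / a) * (exp(a) - 1.0))
    if n == 0:
        return p0
    s = stirling2_row(n)
    return p0 / factorial(n) * sum(s[i] * lam1 ** i * a ** (n - i) for i in range(1, n + 1))


def chain_probability_series(n_max, lam1, a):
    """Coefficients of x^0..x^n_max in f(x), by the recurrence for exp of a power series."""
    g = [0.0] + [lam1 * a ** (s - 1) / factorial(s) for s in range(1, n_max + 1)]
    c = [exp(-(lam1 / a) * (exp(a) - 1.0))]
    for n in range(1, n_max + 1):
        c.append(sum(k * g[k] * c[n - k] for k in range(1, n + 1)) / n)
    return c


if __name__ == "__main__":
    print(stirling2_row(10)[1:])
    print([round(chain_probability(n, 1.0, 1.0), 5) for n in range(6)])
```

The row `stirling2_row(10)` supplies the coefficients of P(10), and its sum is the number $y_{10}(1)$ that appears in the case $\lambda_1 = a$.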
4. Application of the method of steepest descent. If $\lambda_s$ is not given by (7)
the above method of direct expansion of f(x) into a power series usually becomes
inapplicable. In many cases it is possible to use instead the method of steepest
descent [5] in order to obtain approximate values for the coefficients of $x^n$ in the
relation (3).

Fig. 1. Distribution curves for $\lambda_1/a = 1$, $a = 0.1$; $\lambda_1/a = 1$, $a = 1$; $\lambda_1/a = 1$, $a = 2$.
As is well known, if f(z) is an analytical function we have

(22) $\text{coeff. of } z^{n} = \dfrac{1}{2\pi i} \oint \dfrac{f(z)}{z^{n+1}}\, dz = \dfrac{1}{2\pi i} \oint e^{X(x, y) + iY(x, y)}\, dz,$

where $X + iY = \log \dfrac{f(z)}{z^{n+1}}$ and the integral is taken along any closed path around
the origin.

To evaluate the integral (22) we shall follow a method similar to the one used
by R. H. Fowler [6]. Putting $z = \rho e^{i\alpha}$ the relation (22) may be written:

(23) $\text{coeff. of } z^{n} = \dfrac{1}{2\pi} \displaystyle\int_{-\pi}^{\pi} \dfrac{f(\rho e^{i\alpha})}{(\rho e^{i\alpha})^{n}}\, d\alpha,$

where the value of $\rho$ is arbitrary. We shall put in particular $\rho = x_0$ where $x_0$ is
the root of

(24) $\dfrac{x_0 f'(x_0)}{f(x_0)} = n.$

For most functions which interest us $\dfrac{f(x)}{x^{n}} \to \infty$ as $x \to 0$ and as $x \to K$ (a positive
number which in some cases may be infinite) and the second derivative is always
positive. Consequently $f(x)/x^{n}$ has only one minimum between 0 and K, and (24)
has therefore only one root $x_0$. Developing $\log \dfrac{f(x_0 e^{i\alpha})}{(x_0 e^{i\alpha})^{n}}$ into powers of $\alpha$, (23)
becomes

(25) $\text{coeff. of } x^{n} = \dfrac{1}{2\pi}\, \dfrac{f(x_0)}{x_0^{n}} \displaystyle\int_{-\pi}^{\pi} \exp\Big[-\dfrac{\alpha^{2}}{2}\, x_0^{2}\, \varphi''(x_0) + \cdots\Big]\, d\alpha,$

where

$$\varphi(x) = \log \frac{f(x)}{x^{n}}.$$

In the case where $\varphi''(x_0) \gg 1$ the first term in the exponent in (25) increases
in absolute value very rapidly in the neighborhood of $x_0$. For small values of $\alpha$
we may therefore in a first approximation drop all other terms. Also, as this
first term tends rapidly towards zero one does not appreciably increase the error
by replacing the integral from $-\pi$ to $+\pi$ by the integral from $-\infty$ to $+\infty$.
In such cases we have, therefore, the approximate formula

(26) $\text{coeff. of } z^{n} \cong \dfrac{1}{2\pi}\, \dfrac{f(x_0)}{x_0^{n}} \displaystyle\int_{-\infty}^{\infty} e^{-\frac{\alpha^{2}}{2} x_0^{2} \varphi''(x_0)}\, d\alpha = \dfrac{f(x_0)}{x_0^{n+1} \sqrt{2\pi \varphi''(x_0)}}.$

We are now in a position to deduce asymptotic values for the probabilities P(n)
which we have previously calculated directly. In fact, for f(x) defined by (9)
we obtain from (26) for large n

(27) $P(n) \cong \dfrac{e^{-(\lambda_1/a)\, e^{a} + (\lambda_1/a)\, e^{a x_0}}}{\sqrt{2\pi}\; x_0^{n} \sqrt{n(a x_0 + 1)}},$

where $x_0$ is given by

$$e^{a x_0} = \frac{n}{\lambda_1 x_0}.$$

In particular for $\lambda_1 = a$ and putting $a x_0 = y_0$ it follows that

(28) $P(n) \cong e^{-e^{a} + n/y_0} \Big(\dfrac{a}{y_0}\Big)^{n} \dfrac{1}{\sqrt{2\pi n (y_0 + 1)}}.$

Comparing the numerical values given by the relation (28) with the exact values
we find that even for n = 4 and $\lambda_1 = 1$ (28) gives an approximation with an error
of about 5%.

Formula 2 (26) can also be used to evaluate the numbers $y_n(1)$ defined by (12)
for $\alpha = 1$. Relation (13) gives for $\alpha = \beta = 1$

$$e^{e^{x}} = e\Big[1 + \sum_{n=1}^{\infty} \frac{y_n(1)\, x^{n}}{n!}\Big]$$

and therefore

$$\text{coeff. of } x^{n} \text{ in expansion of } e^{e^{x}} = \frac{e\, y_n(1)}{n!}.$$

Putting $f(x) = e^{e^{x}}$ and using Stirling's formula for n! we have from (26)

$$y_n(1) \cong \frac{e^{\,n\left[x_0 + \frac{1}{x_0} - \left(1 + \frac{1}{n}\right)\right]}}{\sqrt{x_0 + 1}},$$


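2 Applying this relation to $f(z) = e^{z}$ one obtains immediately Stirling's formula:

$$\varphi(z) = \log \frac{f(z)}{z^{n}} = z - n \log z, \qquad \varphi'(z) = 1 - \frac{n}{z}, \qquad x_0 = n, \qquad \varphi''(x_0) = \frac{1}{n},$$

$$\frac{1}{n!} \cong \frac{e^{n}}{n^{n+1} \sqrt{2\pi/n}} = \Big(\frac{e}{n}\Big)^{n} \frac{1}{\sqrt{2\pi n}}.$$

Also relation (26) is useful to find other asymptotic expressions; e.g., for $f(z) = (pz + q)^{N}$ one
obtains for $n \to \infty$ the Laplace-Gauss formula.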
where $x_0$ is given by

$$e^{x_0} = \frac{n}{x_0}.$$

For n = 4, $x_0 = 1.202$ and $y_4(1) \cong 15.56$. As the exact value of $y_4(1)$ is 15 we
obtain in this case an error of less than 4%.

Repeating the calculations for n = 6, $x_0 = 1.432$, we find that $y_6(1)$ is given
with an error of less than 3%.
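
The size of these errors can be reproduced with a small Python sketch. It assumes the approximation reads $y_n(1) \cong e^{n[x_0 + 1/x_0 - (1 + 1/n)]}/\sqrt{x_0 + 1}$ with $x_0 e^{x_0} = n$, solves for $x_0$ by Newton's method, and compares with exact values of $y_n(1)$ generated by the Bell triangle (the numbers $y_n(1)$ are the Bell numbers). The function names are illustrative only.

```python
from math import exp, sqrt


def solve_saddle(n, tol=1e-12):
    """Root x0 > 0 of x * exp(x) = n, found by Newton's method."""
    x = 1.0
    for _ in range(100):
        step = (x * exp(x) - n) / (exp(x) * (1.0 + x))
        x -= step
        if abs(step) < tol:
            break
    return x


def bell_saddle_estimate(n):
    """Steepest-descent estimate of y_n(1), in the form assumed in the lead-in."""
    x0 = solve_saddle(n)
    return exp(n * (x0 + 1.0 / x0 - 1.0 - 1.0 / n)) / sqrt(x0 + 1.0)


def bell_exact(n):
    """Exact y_n(1) for n >= 1, computed with the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        new = [row[-1]]            # each row starts with the last entry of the previous row
        for value in row:
            new.append(new[-1] + value)
        row = new
    return row[-1]


if __name__ == "__main__":
    for n in (4, 6, 10):
        exact = bell_exact(n)
        estimate = bell_saddle_estimate(n)
        print(n, round(solve_saddle(n), 4), exact, round(estimate, 2),
              round(100.0 * (estimate / exact - 1.0), 2))
```

Small differences from the figures quoted in the text are to be expected, since the result is sensitive to the rounding of $x_0$.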
NOTES


This section is devoted to brief research and expository articles, notes on
methodology and other short items.


A NOTE ON SOME SINGLE SAMPLING PLANS REQUIRING THE 
INSPECTION OF A SMALL NUMBER OF ITEMS 


By J. H. Curtiss 
Cornell University 1 


In the practical application of sampling inspection plans it is often necessary
to restrict the number of items (pieces, samples) inspected from each inspection
lot to a relatively small number. For example, if many vendors are supplying
a manufacturer with small lots of various kinds of material, the manufacturer
will usually wish to have some check on his suppliers; however, he cannot afford
to inspect large numbers of items from each lot. If sampling plans requiring
the inspection of a small number of items are used, it is advantageous to know
the characteristics of such plans. The present note offers several single sampling
plans with sample size n < 25, together with their operating characteristic
curves (OC curves) and average outgoing quality curves (AOQ curves). 1

Single sampling plans for large lots may be described by the number n of items
to be inspected, and the rejection number r. If r or more of the items inspected
fail to meet some predetermined standard the lot is rejected; if less than r items
fail to meet the standard the lot is accepted.

The OC curve (see Figures 1, 1A, 3 and 5) shows the relationship between the
probability of rejecting a lot and the true quality of the lot. The quality of the
lot is often measured by the "percent defective" in the lot; i.e., the proportion of
material which does not meet some predetermined standard. It should be noted
that the definition of OC curve given here is only one of several in common use.
In particular, the vertical axis often gives the probability of "acceptance"; such
a treatment would amount to an "inversion" of the curves given here. Another

1 The material in this note was originally prepared as an office memorandum for the use
of engineering technical personnel in a Government Bureau. The author wishes to
express his appreciation to Mr. C. F. Mosteller for extensive editorial work on the original memoran-
"^ h “„ b c lU ,™ » * "»W» -«• tor publication in ttj 

isnot customarvt *7 ofter \ ade ^ at '’ to analyte single sampling plans because it 

ance or S IT P eVen when the outcome of tha 1 Wticm (accept- 

ance or rejection) is determined before all the items are inspected. In other kinds of cam 

samplfcur tr f0 r e ’trSrir d » <*«*> used after the first 

T pl,ng ftl8 ° u 8 e f ul - Howew ^ if «• »■ 

inspeetinghis own product mmht he P eotlon > including detailing, as a manufacturer 
"»« » tho.ra. 1 . .moral o, to 
common form would have the "percentage of presented lots (of quality indicated
on the horizontal axis) that will be rejected (accepted)" as its vertical scale.



Figure 1


It has been assumed that the lots are so large that the samples can be regarded
as being drawn from an infinite population, or to put it another way, that there
is no error in treating the samples as if they had been randomly drawn "with
replacement".



Figure 1A

Especial interest is often attached to the points where the curve crosses the
5% and the 90% probability levels. A rejection probability of 5% is frequently
associated with a quality value that has been called the "acceptable quality level"
(abbreviated AQL), and in published sampling tables by Dodge and Romig, 3 a
rejection probability of 90% is associated with a quality value which they call
the "tolerance percent defective."

The average outgoing quality curve (AOQ curve, see Figures 2, 4 and 6) of a 
sampling plan shows the relationship between the long run average quality of 
the outgoing product after sampling inspection and the quality of the product as 
submitted for inspection. The quality of the product in each case is usually 
measured by the “percent defective” in the product. 

SUPPLEMENT TO FIGURES 1 AND 1A

Quality of lot (measured in percent defective) corresponding to various probabilities
of rejection, for sampling plans in which a lot is to be rejected if one or more
defective items are found in a set of n random sample items

                         Probability of Rejection
  n       .01       .05       .25       .50       .75       .90
        percent   percent   percent   percent   percent   percent

  1      01.00       --      25.00     50.00       --      90.00
  2      00.50     02.53     13.40     29.29       --      68.38
  3      00.34       --      09.14     20.63       --      53.58
  4      00.25     01.28     06.94     15.91     29.29     43.77
  5      00.20       --      05.59     12.95     24.21     36.90
  6      00.17       --      04.68     10.91       --      31.87
  7      00.14       --      04.03     09.43     17.97     28.03
  8      00.12       --      03.53     08.30     15.91     25.01
  9      00.11       --      03.14     07.41     14.28     22.57
 10      00.10       --      02.84     06.70     12.95     20.57
 11      00.09     00.47     02.58     06.11     11.84     20.40
 12      00.08       --      02.37     05.61       --      17.46
 14      00.07       --      02.03     04.83       --      15.17
 16      00.06     00.32     01.78     04.24     08.30     13.40
 20      00.05     00.26     01.43     03.41     06.70     10.88
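
Entries of this kind can be regenerated from the rejection rule itself. The Python sketch below assumes, as the note does, that lots are large enough for the n sample items to be treated as independent draws, so that a lot with fraction defective p is rejected with probability $1 - (1 - p)^{n}$ when the rejection number is one; solving for p gives the tabulated quality. The function name is illustrative only.

```python
def percent_defective_for_rejection(n, prob_reject):
    """Percent defective at which a reject-on-one-defective plan of size n
    rejects a lot with probability prob_reject (independent draws assumed)."""
    return 100.0 * (1.0 - (1.0 - prob_reject) ** (1.0 / n))


if __name__ == "__main__":
    levels = (0.01, 0.05, 0.25, 0.50, 0.75, 0.90)
    for n in (1, 2, 3, 4, 5, 10, 20):
        print(n, ["%05.2f" % percent_defective_for_rejection(n, level) for level in levels])
```

The same function gives the points at which an OC curve for such a plan crosses the 5% and 90% probability levels.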


The average outgoing quality is dependent upon the treatment of rejected lots. 
If rejected lots are cast aside once and for all, and are never resubmitted with all 
deficiencies corrected, then the average quality of the outgoing product after 
the sampling inspection tends to be the same as the average quality of the product 
submitted for inspection (provided that the quality of individual lots does not 
fluctuate too wildly). The only direct effect that the sampling inspection has 
in this case is to reduce the amount of the product which is accepted. However, 

3 H. F. Dodge and H. G. Romig, Sampling Inspection Tables, Single and Double Sam-
pling, John Wiley and Sons, Inc., New York, 1944.
the situation is very different if a rejected lot is always resubmitted with all de-
fective material removed or replaced with non-defective material. In this case,



Figure 2 


the average quality of the outgoing product after the sampling inspection will
tend to be better than the average quality of the product submitted for inspec-
tion. In fact, if the submitted quality is very poor, the average outgoing quality
will theoretically tend to be very good, because so many of the lots are rejected
and then detailed.



Under the assumption that each rejected lot will be detailed and resubmitted 
with all deficiencies corrected, a typical average outgoing quality curve starts 
at the origin, rises rapidly to a maximum, and falls off more slowly. The maximum
average outgoing quality is called the average outgoing quality limit
(AOQL) of the plan.



Figure 4 


The graphs give the operating characteristic curves and average outgoing
quality curves of certain single sampling plans. It is assumed the samples are
taken at random without replacements from a lot which contains at least 10 times
the specified number of samples. In the case of the average outgoing quality
curves, it is further assumed that rejected lots are always detailed and resub-
mitted with all the defective material replaced by non-defective material. An
approximation has been made in the calculation of the AOQ curves which makes
them upper bounds. If it is assumed that many lots of size N of exactly the
same quality of product p are being produced and that we are taking samples of
size n from them, then it follows that $\text{AOQ} = p\, P_a\, (1 - n/N)$, where $P_a$ is the
probability of accepting a lot. The term n/N has been omitted; therefore these
AOQ curves are too high, but are a good approximation provided only that the
ratio of sample size to lot size is small. The condition mentioned earlier in this
paragraph requires that n/N < 0.1.
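
A minimal Python sketch of the AOQ calculation follows. It assumes a binomial model for the number of defectives in the sample, so that $P_a$ is the probability of fewer than r defectives among n items, and it locates the AOQL by a simple grid search over p. The function names and the optional `lot_size` argument are illustrative choices, not part of the note.

```python
from math import comb


def acceptance_probability(n, r, p):
    """P_a: probability of fewer than r defectives in a sample of n (binomial model)."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(r))


def average_outgoing_quality(n, r, p, lot_size=None):
    """AOQ = p * P_a * (1 - n/N); the factor (1 - n/N) is dropped when lot_size is None."""
    factor = 1.0 if lot_size is None else 1.0 - n / lot_size
    return p * acceptance_probability(n, r, p) * factor


def average_outgoing_quality_limit(n, r, steps=10000):
    """AOQL: the largest AOQ over a grid of submitted qualities p in [0, 1]."""
    return max(average_outgoing_quality(n, r, i / steps) for i in range(steps + 1))


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, round(100.0 * average_outgoing_quality_limit(n, 1), 2))
```

For rejection number r = 1 the curve is $p(1 - p)^{n}$, which starts at the origin, peaks at $p = 1/(n + 1)$ and then falls off, matching the shape described for a typical AOQ curve.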


ON THE USE OF THE SAMPLE RANGE IN AN ANALOGUE
OF STUDENT'S t-TEST

By Joseph F. Daly
Bureau of Ships, Navy Department

Let $x_1, \cdots, x_N$ represent independent observations on a variate x which is
normally distributed with mean $\mu$ and variance $\sigma^2$. Assuming no prior informa-
tion about the value of either parameter, let $H_0$ be the hypothesis that $\mu$ is equal
to or less than a specified quantity $\mu_0$. The classical test of this asymmetrical
form of "Student's" hypothesis [1] is based upon the statistic

$$t = \sqrt{N}\,(\bar{x} - \mu_0)/s,$$

the region of rejection being defined by the relation $t > t_\epsilon$.

For certain applications of a routine nature, however, such as production line
inspection, the usefulness of this test is rather seriously impaired by the arith-
metical work involved in the computation of t. For this reason Dodge [2] and
Knudsen [3] among others have proposed tests of $H_0$ based on a statistic of the
form

$$G = \frac{\bar{x} - \mu_0}{w},$$

where w is the sample range. It is the object of this note to show how the
probability distribution of G can be obtained with the aid of the distribution law
of w tabulated by Pearson and Hartley [4], and to present some numerical results
which indicate that the power of the resulting test is the same for all practical
purposes as that of "Student's" t-test for sample sizes $N \leq 10$.

The calculation of the percent points of the G distribution is greatly facilitated
by the following result, which does not appear to be generally known:

Lemma: If $\bar{x}$ and w represent respectively the average and the range of a sample
of N independent observations on a normally distributed variate x, then $\bar{x}$ and w are
statistically independent.

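Proof: No generality is lost by putting $\mu = 0$, $\sigma = 1$. The joint character-
istic function of $\bar{x}$ and the $\tfrac{1}{2}N(N - 1)$ differences $x_j - x_k$, $(j < k)$, is then

$$\varphi(t, t_{jk}) = (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2} \sum_{j} x_j^{2} + i\,\frac{t}{N} \sum_{j} x_j + i \sum_{j,k} t_{jk}(x_j - x_k)\Big\}\, dx_1 \cdots dx_N,$$

where the summation runs from 1 to N on each index with the understanding that
$t_{jk} = 0$ for $j \geq k$. The usual process of completing the square in the exponent
then yields

$$\varphi(t, t_{jk}) = (2\pi)^{-N/2} \exp\Big\{-\tfrac{1}{2} \sum_{j} \Big[\frac{t}{N} + \sum_{k} (t_{jk} - t_{kj})\Big]^{2}\Big\} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2} \sum_{j} \Big(x_j - i\Big[\frac{t}{N} + \sum_{k} (t_{jk} - t_{kj})\Big]\Big)^{2}\Big\}\, dx_1 \cdots dx_N.$$

Since

$$\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x + ib)^{2}}\, dx = \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^{2}}\, dx,$$

this reduces to

$$\varphi(t, t_{jk}) = \exp\Big\{-\tfrac{1}{2} \sum_{j} \Big[\frac{t}{N} + \sum_{k} (t_{jk} - t_{kj})\Big]^{2}\Big\},$$

which readily factors into

$$\varphi(t, t_{jk}) = e^{-t^{2}/2N} \cdot \exp\Big\{-\tfrac{1}{2} \sum_{j} \Big[\sum_{k} (t_{jk} - t_{kj})\Big]^{2}\Big\} = \varphi_1(t) \cdot \varphi_2(t_{jk}).$$

Hence the differences $x_j - x_k$ are jointly independent of $\bar{x}$; and since the range
w is a Borel measurable function of these differences (i.e., $w = \max |x_j - x_k|$)
it follows that $\bar{x}$ and w are independently distributed.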

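The foregoing lemma is in fact capable of further generalization as follows:
Let $g(x_1, \cdots, x_N)$ be a function which, like the range, has the property that
$g(x_1 + a, \cdots, x_N + a) = g(x_1, \cdots, x_N)$. The characteristic function of $\bar{x}$ and g
can then be written in the form

$$\theta(t, h) = e^{-t^{2}/2N} \cdot (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2} \sum_{j} \Big(x_j - \frac{it}{N}\Big)^{2} + i h g\Big\}\, dx_1 \cdots dx_N.$$

Now if the second factor $\psi$ is analytic in t, it must be a constant as far as varia-
tion with t is concerned; for by putting t = iNa (a real) we have

$$\begin{aligned}
\psi &= (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2} \sum_{j} (x_j + a)^{2} + i h\, g(x_1, \cdots, x_N)\Big\}\, dx_1 \cdots dx_N \\
&= (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2} \sum_{j} z_j^{2} + i h\, g(z_1 - a, \cdots, z_N - a)\Big\}\, dz_1 \cdots dz_N \\
&= (2\pi)^{-N/2} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp\Big\{-\tfrac{1}{2} \sum_{j} z_j^{2} + i h\, g(z_1, \cdots, z_N)\Big\}\, dz_1 \cdots dz_N = \psi(h).
\end{aligned}$$

Therefore $\psi$, being constant in t along the axis of imaginaries, must be free
of t throughout the complex plane. The joint characteristic function of $\bar{x}$ and
g is thus equal to the product of their respective characteristic functions, so that
the two variates are independently distributed. In particular this result shows
that in the normal case each of the moments about the sample mean is distributed
independently of $\bar{x}$.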

Returning now to the distribution of $G$, we see that for $G_0 > 0$, writing
$z = \sqrt{N}(\bar{x} - \mu_0)/\sigma$,

    $$P\{G > G_0 \mid \mu = \mu_0\} = P\Big\{ \frac{z}{\sqrt{N} G_0} > \frac{w}{\sigma} \Big\} = \int_0^{\infty} \int_0^{z/\sqrt{N} G_0} f(z) h(w) \, dw \, dz = \int_0^{\infty} f(z) \, P(z/\sqrt{N} G_0) \, dz ,$$

where $f(z)$ is the normal probability function for $\mu = 0$, $\sigma^2 = 1$ and $P(u)$ is
the value [4] of the probability that the range of a sample of $N$ observations
will be less than $u$ standard units. For selected values of $N$ Table I gives the
value $G_{.05}$ such that

    $$P\{(\bar{x} - \mu_0)/w > G_{.05} \mid \mu = \mu_0\} = .05.$$

TABLE I

Upper 5% points for distribution of G

    N       G_.05
    3       .88
    5       .39
    7       .26
    10      .19

These values were calculated by Simpson's rule and checked by Weddle's rule.

To evaluate the probability that $G$ will exceed $G_0$ when $\mu \ne \mu_0$ we may write,
following Johnson and Welch [5],

    $$\frac{\bar{x} - \mu_0}{w} = \frac{\sqrt{N}(\bar{x} - \mu)/\sigma + \sqrt{N}(\mu - \mu_0)/\sigma}{\sqrt{N} w/\sigma} = \frac{z + \delta}{\sqrt{N} w/\sigma} .$$

The required probability is then given by the integral

    $$\int_{-\delta}^{\infty} f(z) \, P\Big( \frac{z + \delta}{\sqrt{N} G_0} \Big) \, dz , \qquad \delta = \sqrt{N}(\mu - \mu_0)/\sigma .$$

Table II is a comparison of the probability that $G$ will exceed $G_{.05}$ with the
corresponding probability that "Student's" $t$ will exceed $t_{.05}$ for various values of
$(\mu - \mu_0)/\sigma$, the case $N = 3$ being chosen because the non-central $t$ distribution
is formally integrable in this case.

TABLE II

Probability of rejection for G and for t, (N = 3)

    (mu - mu_0)/sigma     P{G > .88}     P{t > 2.92}
    .00                   .050           .050
    .50                   .151           .151
    .75                   .229           .230
    1.00                  .322           .322

Similarly for $N = 10$ it was found that when $\mu - \mu_0 = .383\sigma$ (i.e., when $\delta = 1.21$)
the probability that $G$ will exceed $G_{.05}$ is .296; the corresponding probability
for $t$ is given by Neyman and Tokarska [1] as .30.
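
The integral for the probability of rejection can be evaluated numerically. The following is a minimal sketch and not the author's own computation: in place of the Pearson and Hartley table [4] it assumes the standard integral P(u) = N ∫ f(x)[F(x + u) − F(x)]^(N−1) dx for the distribution of the range of N standard normal observations, and it uses Simpson's rule over truncated limits. All function names are illustrative.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(func, a, b, n):
    """Composite Simpson's rule with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def range_cdf(u, N):
    """P(u): probability that the range of N standard normal values is below u."""
    if u <= 0.0:
        return 0.0
    def integrand(x):
        return normal_pdf(x) * (normal_cdf(x + u) - normal_cdf(x)) ** (N - 1)
    return N * simpson(integrand, -8.0, 8.0, 400)

def prob_G_exceeds(G0, N, delta=0.0):
    """Integral of f(z) * P((z + delta) / (sqrt(N) * G0)) over z > -delta."""
    c = math.sqrt(N) * G0
    def integrand(z):
        return normal_pdf(z) * range_cdf((z + delta) / c, N)
    return simpson(integrand, -delta, 8.0, 200)

if __name__ == "__main__":
    for shift in (0.0, 0.5, 0.75, 1.0):          # (mu - mu_0) / sigma, N = 3
        print(shift, round(prob_G_exceeds(0.88, 3, math.sqrt(3) * shift), 3))
```

With delta = 0 the function gives the null probability used to define the 5% point; with delta = sqrt(N)(mu − mu_0)/sigma it gives the rejection probabilities of the kind compared in Table II, to the accuracy of this quadrature.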

Pending the construction of more adequate tables of the percent points of the
$G$ distribution, it seems worthy of note that for $N \le 10$ the values of $G_{.05}$ can
be estimated quite accurately by multiplying the corresponding upper percent
point $t_{.05}$ by the factor

    $$\frac{E\sqrt{\Sigma(x - \bar{x})^2/(N - 1)}}{\sqrt{N} \, E(w)} ,$$

where $E(w)$ is obtainable from Tippett's table of the mean range [6]. Estimated
values of $G_{.05}$ for sample sizes from 3 to 10 are listed for convenience in Table III.
The approximate values of $G_{.05}$ proposed by Knudsen [3] were calculated in
essentially this fashion, using however the square root of the expected value of
$\Sigma(x - \bar{x})^2$ instead of the expected value of $\sqrt{\Sigma(x - \bar{x})^2}$, and employing percent
points of the $t$ distribution determined by the relation $P\{|t| > t_{.05}\} = .05$
instead of $P\{t > t_{.05}\} = .05$. Thus though the agreement between the values
listed in Table III and the corresponding computed values shown in Table I
is extremely good, the discrepancy between these values and those given by
Knudsen is rather large. Any error committed by using Knudsen's table will,
however, be on the conservative side, in the sense that the probability of un-
justly rejecting $H_0$ will have somewhat less than half the value indicated in that
table.

TABLE III

Estimated upper 5% points for distribution of G

    N       G_.05
    3       .882
    4       .520
    5       .385
    6       .309
    7       .260
    8       .227
    9       .202
    10      .183
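
The estimate just described (the upper 5% point of t times the ratio of the expected root-mean-square deviation to sqrt(N) times the mean range) can be sketched as follows. This is an illustration under stated assumptions, not the author's worksheet: the mean range is computed from the standard integral E(w) = ∫[1 − F(x)^N − (1 − F(x))^N] dx rather than read from Tippett's table, the expected value of sqrt(Σ(x − x̄)²) is taken from the chi distribution with N − 1 degrees of freedom, and the t percent points are the usual tabled values.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mean_range(N, lo=-10.0, hi=10.0, steps=4000):
    """E(w) in standard units, by Simpson's rule on 1 - F^N - (1 - F)^N."""
    h = (hi - lo) / steps
    def g(x):
        F = normal_cdf(x)
        return 1.0 - F ** N - (1.0 - F) ** N
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0

def mean_root_sum_squares(N):
    """E sqrt(sum (x - xbar)^2) in standard units (chi with N - 1 d.f.)."""
    return math.sqrt(2.0) * math.gamma(N / 2.0) / math.gamma((N - 1) / 2.0)

# Upper 5% points of Student's t with N - 1 degrees of freedom (tabled values).
T_05 = {3: 2.920, 4: 2.353, 5: 2.132, 6: 2.015,
        7: 1.943, 8: 1.895, 9: 1.860, 10: 1.833}

def estimated_G05(N):
    factor = mean_root_sum_squares(N) / (math.sqrt(N * (N - 1)) * mean_range(N))
    return T_05[N] * factor

if __name__ == "__main__":
    for N in sorted(T_05):
        print(N, round(estimated_G05(N), 3))
```

The factor is written with sqrt(N(N − 1)) in the denominator, which is the same quantity as dividing the sum of squares by N − 1 under the root and E(w) by 1/sqrt(N).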
AN INEQUALITY FOR DEVIATIONS FROM MEDIANS

By John W. Tukey
Princeton University

In a recent note in these Annals, Birnbaum and Zuckerman [1] proved that if:

(1) $X_1, X_2, \cdots, X_n$ are independent random variables with the same
distribution (i.e., form a sample),

(2) their common distribution is symmetric about zero,

then

    $$E(|X_1 + X_2 + \cdots + X_n|) \ge \varphi(n) \cdot E(|X_1|),$$

where

    $$\varphi(2k + 1) = \varphi(2k + 2) = \frac{1 \cdot 3 \cdot 5 \cdot 7 \cdots (2k + 1)}{1 \cdot 2 \cdot 4 \cdot 6 \cdots (2k)} .$$

It is the purpose of the present note to extend this to the following, more
general, result:

Theorem. If

(i) $X_1, X_2, \cdots, X_n$ are independent random variables,

(ii) the median of each $X_i$ is zero,

then

    $$E(|X_1 + X_2 + \cdots + X_n|) \ge \varphi(n) \cdot \frac{E(|X_1| + |X_2| + \cdots + |X_n|)}{n} .$$

It will be convenient to let $d_i = E(|X_i|)$ and

    $$d = \frac{E(|X_1| + |X_2| + \cdots + |X_n|)}{n} ,$$

so that the desired inequality becomes

    $$E(|X_1 + X_2 + \cdots + X_n|) \ge \varphi(n) \cdot d .$$

Define $e_i$ by

(4)    $$e_i = \int_0^{\infty} x \, dF_i(x) ,$$

where $F_i(x)$ is the cumulative distribution function of $X_i$. Since

    $$d_i = E(|X_i|) = -\int_{-\infty}^{0} x \, dF_i(x) + \int_0^{\infty} x \, dF_i(x) ,$$

it follows that

(5)    $$\int_{-\infty}^{0} x \, dF_i(x) = e_i - d_i .$$


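The basic idea of the proof, which is common to both notes, is to divide the
n-dimensional space of $x_1, x_2, \cdots, x_n$ into its $2^n$ "octants," break up the
expectation of $|X_1 + X_2 + \cdots + X_n|$ into the corresponding parts, and apply
elementary inequalities. Let $O_S$ be the octant in which a set $S$ of variables
are < 0. From (4), (5) and hypothesis (ii) it follows that

    $$2^{n-1} \int \cdots \int_{O_S} x_i \prod_j dF_j(x_j) = \begin{cases} e_i , & \text{if } x_i > 0 \text{ in } O_S , \\ e_i - d_i , & \text{if } x_i < 0 \text{ in } O_S . \end{cases}$$

Hence

    $$2^{n-1} \int \cdots \int_{O_S} \sum_i x_i \prod_j dF_j(x_j) = \sum_i e_i - \sum_{i \in S} d_i = e - \sum_{i \in S} d_i ,$$

where $e = \sum e_i$, and the second and third sums are over all $d_i$ for which $x_i < 0$
in the chosen octant $O_S$. The contribution of the octant $O_S$ to $E(|X_1 + X_2 +
\cdots + X_n|)$ is

    $$\int \cdots \int_{O_S} \Big| \sum_i x_i \Big| \prod_j dF_j(x_j) \ge \Big| \int \cdots \int_{O_S} \Big( \sum_i x_i \Big) \prod_j dF_j(x_j) \Big| = 2^{-(n-1)} \Big| e - \sum_{i \in S} d_i \Big| .$$

For each value of $s$, there will be $\binom{n}{s}$ octants with $s$ variables $\le 0$. The sum
of their contribution to $E(|X_1 + X_2 + \cdots + X_n|)$ is

    $$I_s \ge 2^{-(n-1)} \sum_{|S| = s} \Big| e - \sum_{i \in S} d_i \Big| \ge 2^{-(n-1)} \Big| \binom{n}{s} e - \binom{n-1}{s-1} \sum_i d_i \Big| ,$$

where the inequality follows from $\Sigma |a_i| \ge |\Sigma a_i|$, and it is to be noticed that each
$d_i$ occurs in $\binom{n-1}{s-1}$ different inner sums. Recalling that $\Sigma d_i = nd$, this may
be written

    $$I_s \ge 2^{-(n-1)} \binom{n}{s} |e - sd| .$$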
Finally,

    $$E(|X_1 + X_2 + \cdots + X_n|) = \sum_{s=0}^{n} I_s \ge 2^{-(n-1)} \sum_{s=0}^{n} \binom{n}{s} |e - sd|$$

    $$\ge 2^{-(n-1)} \sum_{2s < n} \binom{n}{s} \{ |e - sd| + |e - (n - s)d| \}$$

    $$\ge 2^{-(n-1)} \sum_{2s < n} \binom{n}{s} (n - 2s) d ,$$

where the last inequality follows from $|a| + |b| \ge b - a$. To complete the
proof, it is only necessary to evaluate the last sum. One method of evaluation
may be found in Birnbaum and Zuckerman's note.

If each $X_i = \pm 1$, each with probability one-half, then all of the inequalities
of the proof become equalities, so that, in this case,

    $$E(|X_1 + X_2 + \cdots + X_n|) = \varphi(n) \cdot d .$$

Since the limiting distribution in this case is a normal distribution with
standard deviation $n^{1/2}$ and $E(|X_1 + X_2 + \cdots + X_n|) = (2n/\pi)^{1/2}$, it follows that
this is the asymptotic value of $\varphi(n)$.
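
The equality case and the asymptotic value can be checked by direct computation. The sketch below is illustrative only (the function names are not from the note): it builds φ(n) from its product definition with exact fractions, computes E|X_1 + ... + X_n| exactly for X_i = ±1 from the binomial distribution of the number of negative signs, and compares φ(n) with (2n/π)^(1/2) for a large n.

```python
from fractions import Fraction
from math import comb, pi, sqrt

def phi(n):
    """phi(2k+1) = phi(2k+2) = (3*5*...*(2k+1)) / (2*4*...*(2k))."""
    k = (n - 1) // 2
    value = Fraction(1)
    for j in range(1, k + 1):
        value *= Fraction(2 * j + 1, 2 * j)
    return value

def mean_abs_sum_of_signs(n):
    """Exact E|X_1 + ... + X_n| when each X_i is +1 or -1 with probability 1/2."""
    # With s negative signs the sum equals n - 2s, and there are C(n, s) such cases.
    total = sum(comb(n, s) * abs(n - 2 * s) for s in range(n + 1))
    return Fraction(total, 2 ** n)

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, phi(n), mean_abs_sum_of_signs(n))
    print(float(phi(400)) / sqrt(2 * 400 / pi))   # ratio close to 1
```

Here d = 1, so the two printed columns agree whenever the inequalities of the proof are equalities.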

The inequality of the theorem is only efficient when the $E(|X_i|)$ are of nearly
the same size. In other cases it can often be usefully supplemented by the

Lemma. If

(i) $X_1, X_2, \cdots, X_n$ are independent,

(ii) for each $i$, either $X_i$ has median zero, or the sum of the means of the other $X_j$
is zero (this is implied by either (a) the median of each $X_i$ is zero, or (b) the mean
of each $X_i$ is zero), then

    $$E(|X_1 + X_2 + \cdots + X_n|) \ge \max_i E(|X_i|) .$$

The lemma follows from the case where $n = 2$, by applying that case to

    $$Y_1 = X_{i_0} , \qquad Y_2 = \sum_{i \ne i_0} X_i ,$$

where the maximum of $E(|X_i|)$ is attained for $i = i_0$.

The special case follows from the inequality

    $$|x_1 + x_2| \ge |x_1| + x_2 \, \mathrm{sgn} \, x_1 ,$$

since this implies

    $$E(|X_1 + X_2|) \ge E(|X_1|) + E(X_2) \cdot E(\mathrm{sgn} \, X_1) = E(|X_1|) ,$$

using first (i) and then (ii).

In conclusion, it is interesting to note that the mean cannot replace the
median in the hypothesis of the theorem. For let $X_1, X_2, X_3$ be independent,
and take the values 1 (with probability 2/3) and -2 (with probability 1/3).
$X_1 + X_2 + X_3$ takes the values 3 (with probability 8/27), 0 (with probability
12/27), -3 (with probability 6/27) and -6 (with probability 1/27). Hence
$E(|X_1|) = 4/3$, and $E(|X_1 + X_2 + X_3|) = 48/27 = 16/9 = (4/3) E(|X_1|)$,
which is not $\ge (3/2) E(|X_1|)$.
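
The arithmetic of this counterexample is easy to confirm by enumerating the 27 outcomes with exact fractions. The snippet is a small sketch for verification only; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each X_i is 1 with probability 2/3 and -2 with probability 1/3:
# the mean is zero, but the median is 1, not zero.
VALUES = {1: Fraction(2, 3), -2: Fraction(1, 3)}

def mean_abs_sum(n):
    """Exact E|X_1 + ... + X_n| by enumerating all outcomes."""
    total = Fraction(0)
    for outcome in product(VALUES, repeat=n):
        prob = Fraction(1)
        for v in outcome:
            prob *= VALUES[v]
        total += prob * abs(sum(outcome))
    return total

if __name__ == "__main__":
    e1, e3 = mean_abs_sum(1), mean_abs_sum(3)
    print(e1, e3, e3 / e1)        # the ratio falls short of phi(3) = 3/2
```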
ON THE INDEPENDENCE OF THE EXTREMES IN A SAMPLE [1]

By E. J. Gumbel

New School for Social Research

[1] Research done with the support of a grant from the American Philosophical Society.

In a previous article [1] the assumption was used that the mth observation in
ascending order (from the bottom) and the mth observation in descending order
(from the top) are independent variates, provided that the rank m is small com-
pared to the sample size n. In the following it will be shown that this assump-
tion holds for the usual distributions.

Let $x$ be a continuous, unlimited variate, let $\Phi(x)$ be the probability of a value
equal to, or less than, $x$; let $\varphi(x)$ be the density of probability, henceforth called
the initial distribution. The mth observation from the bottom is written ${}_m x$
and the kth observation from the top is written $x_k$. Thus, the bivariate dis-
tribution $w_n({}_m x, x_k)$ of ${}_m x$ and $x_k$ is such that there are $m - 1$ observations less
than ${}_m x$, $k - 1$ observations greater than $x_k$ and $n - m - k$ observations between
${}_m x$ and $x_k$.

For simplicity's sake write

    $$\Phi({}_m x) = {}_m\Phi ; \qquad \Phi(x_k) = \Phi_k ;$$

    $$\varphi({}_m x) = {}_m\varphi ; \qquad \varphi(x_k) = \varphi_k .$$

Then

(1)    $$w_n({}_m x, x_k) = C \; {}_m\Phi^{m-1} \; {}_m\varphi \; (\Phi_k - {}_m\Phi)^{n-m-k} \; \varphi_k \; (1 - \Phi_k)^{k-1} ,$$

where

(1')    $$C = \frac{n!}{(m - 1)! \, (k - 1)! \, (n - m - k)!} .$$

In the expression (1) no assumption about dependence or independence of
${}_m x$ and $x_k$ is implied except that these values are taken from the same population.
The distribution (1) is now modified by introducing three conditions. First,
that the two variates are extreme, namely that the ranks m and k are of the same
order of magnitude and small compared to the sample size n.

(2)    $$n \gg m \approx k = O(1).$$

Furthermore it is assumed that the initial distribution $\varphi(x)$ is, for small and for
large values of the variate, subject to L'Hospital's rules

(3)    $$\lim_{x \to -\infty} \frac{\varphi'(x)}{\varphi(x)} = \lim_{x \to -\infty} \frac{\varphi(x)}{\Phi(x)} ; \qquad \lim_{x \to \infty} \frac{\varphi'(x)}{\varphi(x)} = -\lim_{x \to \infty} \frac{\varphi(x)}{1 - \Phi(x)} .$$

Finally it is assumed that n is so large that the equality of the limits may be re-
placed by the equality of the quotients. Then it is legitimate to write

(3')    $$\frac{{}_m\varphi'}{{}_m\varphi} = \frac{{}_m\varphi}{{}_m\Phi} ; \qquad \frac{\varphi_k'}{\varphi_k} = -\frac{\varphi_k}{1 - \Phi_k} .$$

Clearly, the three conditions do not imply any assumption about dependence or
independence of the two extremes.

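From (1) the most probable mth value from the bottom, ${}_m u$, and the most
probable kth value from the top, $u_k$, are the solutions of

    $$\frac{m - 1}{{}_m\Phi} \, {}_m\varphi + \frac{{}_m\varphi'}{{}_m\varphi} - \frac{n - m - k}{\Phi_k - {}_m\Phi} \, {}_m\varphi = 0 ,$$

    $$\frac{n - m - k}{\Phi_k - {}_m\Phi} \, \varphi_k + \frac{\varphi_k'}{\varphi_k} - \frac{k - 1}{1 - \Phi_k} \, \varphi_k = 0 .$$

These two equations may be written by virtue of (3')

    $$\frac{m}{{}_m\Phi} = \frac{n - m - k}{\Phi_k - {}_m\Phi} = \frac{k}{1 - \Phi_k} .$$

Consequently the probabilities of the most probable mth and kth values ${}_m u$
and $u_k$ are

(4)    $$\Phi({}_m u) = \frac{m}{n} ; \qquad \Phi(u_k) = 1 - \frac{k}{n} .$$

The expansion of the probabilities ${}_m\Phi$ and $\Phi_k$ around the modes ${}_m u$ and $u_k$ leads
[2, 3] by virtue of (2), (3), (4), to

(5)    $${}_m\Phi = \frac{m}{n} e^{{}_m y} ; \qquad \Phi_k = 1 - \frac{k}{n} e^{-y_k} ,$$

where

(6)    $${}_m y = \frac{n}{m} \varphi({}_m u)({}_m x - {}_m u) ; \qquad y_k = \frac{n}{k} \varphi(u_k)(x_k - u_k) .$$

Therefore, distributions subject to L'Hospital's rules (3) may be said to be of
the exponential type. Since the derivatives ${}_m\varphi$ and $\varphi_k$ are

(7)    $${}_m\varphi = {}_m\alpha \; {}_m\Phi ; \qquad \varphi_k = \alpha_k (1 - \Phi_k) ,$$

where

(7')    $${}_m\alpha = \frac{n}{m} \varphi({}_m u) ; \qquad \alpha_k = \frac{n}{k} \varphi(u_k) ,$$

the product of the first two and the last two functions in formula (1) may be
written as a product of two functions

(8)    $${}_m\Phi^{m-1} \; {}_m\varphi \; \varphi_k \; (1 - \Phi_k)^{k-1} = \Big( {}_m\alpha \frac{m^m}{n^m} e^{m \, {}_m y} \Big) \Big( \alpha_k \frac{k^k}{n^k} e^{-k y_k} \Big) .$$

Clearly, each factor in (8) depends only on one variable.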

In the same way the function of ${}_m x$ and $x_k$ in the middle of (1) can be split up
into a product of two independent functions, each depending only on one vari-
ate. By virtue of (5)

    $$\Phi_k - {}_m\Phi = 1 - \frac{1}{n} \big( m e^{{}_m y} + k e^{-y_k} \big)$$

and by virtue of (2)

(9)    $$(\Phi_k - {}_m\Phi)^{n-m-k} = \exp(-m e^{{}_m y}) \exp(-k e^{-y_k}) ,$$

where

    $$\exp(x) = e^x .$$

From (2) the constant factor (1') may also be split into a product

(10)    $$\frac{n!}{(m - 1)! \, (k - 1)! \, (n - m - k)!} = \frac{n^m}{(m - 1)!} \cdot \frac{n^k}{(k - 1)!} .$$

Introducing (10), (9) and (8) into (1), the bivariate distribution of the mth ex-
treme value from the bottom and the kth extreme value from the top is obtained
as a product of two independent distributions

(11)    $$w_n({}_m x, x_k) = {}_m f({}_m x) \cdot f_k(x_k) ,$$

where

(12)    $${}_m f({}_m x) = \frac{m^m}{(m - 1)!} \, {}_m\alpha \, \exp(m \, {}_m y - m e^{{}_m y})$$

and

(12')    $$f_k(x_k) = \frac{k^k}{(k - 1)!} \, \alpha_k \, \exp(-k y_k - k e^{-y_k})$$

are the distributions of the mth extreme values from the bottom, alone, and of
the kth extreme values from the top, alone.
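
A quick numerical check of the limiting form for the kth value from the top may be useful. The sketch below is an illustration under an assumption: it takes the reduced density of $y_k$ to be $k^k/(k-1)! \cdot \exp(-k y - k e^{-y})$, i.e. (12') with the scale factor $\alpha_k$ absorbed by the change of variable from $x_k$ to $y_k$. It verifies by Simpson's rule that this density integrates to one and that its mode is at y = 0, which corresponds to $x_k = u_k$. The function names are illustrative.

```python
import math

def reduced_density(y, k):
    """Assumed density of y_k: k^k/(k-1)! * exp(-k*y - k*exp(-y))."""
    return k ** k / math.factorial(k - 1) * math.exp(-k * y - k * math.exp(-y))

def simpson(func, a, b, n=20000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        area = simpson(lambda y: reduced_density(y, k), -8.0, 60.0)
        print(k, round(area, 8))
```

The substitution t = k e^{-y} turns the integral into a gamma integral, which is why the total mass is one for every k.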
In the special case $m = k$ and for a symmetrical initial distribution with mean
zero, the following equations hold

(13)    $$-{}_m u = u_k = u_m ;$$

(13')    $${}_m\Phi = 1 - \Phi_k = 1 - \Phi_m ; \qquad {}_m\varphi = \varphi_k = \varphi_m ,$$

and the bivariate distribution of the mth values from the bottom, ${}_m x$, and from
the top, $x_m$, is

(14)    $$w_n({}_m x, x_m) = {}_m f({}_m x) \cdot f_m(x_m) ,$$

where

(14')    $${}_m f({}_m x) = f_m(-x_m)$$

is the expression used in the beginning of article [1].

It follows from (11) that the mth observation in ascending order, and the kth
observation in descending order, may be dealt with as independent variates
provided that n is large, the ranks m and k are small, and that the initial con-
tinuous unlimited distribution is of the exponential type as defined by equations
(3).
A NOTE ON SAMPLING INSPECTION

By Paul Peach and S. B. Littauer

North Carolina State College and Newark College of Engineering

In designing an industrial sampling plan conformable to the Pearson-Neyman
approach, the operating characteristic is made to pass as nearly as possible
through two predetermined points. Wald [1] has used this method for setting up
sequential sampling plans.

A similar type of single sampling plan can be designed by using tables of the
incomplete Beta function. Unfortunately, tables of this function are not
generally available, and the existing tables do not cover the range for large
sample sizes.

An approximate solution of the problem for single sampling can be based on the
widely available tables of percentage points of the chi-square distribution. This
is equivalent to assuming a Poisson distribution of defectives in the sample,
utilizing the well known fact that for even degrees of freedom the chi-square
distribution gives the summation of a Poisson series.

We use the following well established notation:

    $n$ = sample size
    $c$ = acceptance number
    $p_1$ = acceptable fraction defective
    $p_2$ = objectionable fraction defective
    $\alpha$ = risk of rejecting a lot if $p = p_1$
    $\beta$ = risk of accepting a lot if $p = p_2$

There seems little to be gained by using a large assortment of possible risk
values, since the necessary adjustment to secure a desired effect can be made
on the p's. We suggest the adoption of .05 as a standard value for both $\alpha$ and $\beta$.
This convention conforms to much existing statistical practice, in particular to
some existing inspection tables.

We propose also the use of

    $$R_0 = p_2/p_1 ,$$

which we call the "operating ratio," as a measure of the power of discrimination
of an inspection scheme. Dodge and Romig [2] used what is essentially the
reciprocal of $R_0$ as a basis for the construction of sampling plans. Now, assume a
binomial distribution of defectives in samples and a series of single sampling
plans with the same $c$ but different $n$. As $n$ increases, the effective values of
$p_1$ and $p_2$ clearly decrease. Their ratio $R_0$ is not constant, but it does not change
very much after $n$ has got beyond the range of very small samples, say $5(c + 1)$.
The value obtained from the chi-square table is the upper limit of $R_0$ for a fixed $c$
and increasing $n$. Since $R_0$ is to a first approximation a function of $c$ alone,
provided $n$ is not very small, it is a useful index for the construction of tables,
and gives great compactness.

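Using the chi-square approach, we note that

    $$\text{D.F.} = 2c + 2$$

    $$n p_1 = \tfrac{1}{2} \chi^2_{2c+2, \, 1-\alpha}$$

    $$n p_2 = \tfrac{1}{2} \chi^2_{2c+2, \, \beta}$$

    $$R_0 = \frac{\chi^2_{2c+2, \, \beta}}{\chi^2_{2c+2, \, 1-\alpha}} ,$$

where the second subscript is the probability of exceeding the tabled value.
Table I gives $R_0$, $c$, and $n p_1$ over a considerable range, with $\alpha = \beta = .05$.
Given $p_1$ and $p_2$ we calculate $R_0$ and use it to enter the table; $c$ is read off directly,
and the sample size is $n = n p_1 / p_1$.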
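
The design procedure can be sketched in code. The following is a hedged illustration rather than the authors' method of tabulation: instead of reading chi-square percent points from a table, it uses the equivalence the note itself relies on (for 2c + 2 degrees of freedom the chi-square tail is a Poisson sum) and solves the Poisson sum for np_1 and np_2 by bisection. The function names, the bisection bracket, and the rounding of n upward are assumptions made for the example; the choice of c follows the rule of entering the table with the next larger value of R_0.

```python
import math

def poisson_cdf(c, lam):
    """P(X <= c) for a Poisson variate with mean lam."""
    term = math.exp(-lam)
    total = term
    for j in range(1, c + 1):
        term *= lam / j
        total += term
    return total

def solve_mean(c, prob):
    """Mean lam for which P(X <= c) = prob (the sum decreases as lam grows)."""
    lo, hi = 0.0, 10.0 * (c + 1) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(c, mid) > prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def operating_ratio(c, alpha=0.05, beta=0.05):
    """Return (R_0, n*p_1) for acceptance number c under the Poisson assumption."""
    np1 = solve_mean(c, 1.0 - alpha)   # lot accepted with probability 1 - alpha
    np2 = solve_mean(c, beta)          # lot accepted with probability beta
    return np2 / np1, np1

def single_sampling_plan(p1, p2, alpha=0.05, beta=0.05, c_max=500):
    """Largest c whose R_0 is still >= p2/p1, and the matching sample size."""
    target = p2 / p1
    c = 0
    while c < c_max and operating_ratio(c + 1, alpha, beta)[0] >= target:
        c += 1
    np1 = operating_ratio(c, alpha, beta)[1]
    return c, math.ceil(np1 / p1)

if __name__ == "__main__":
    print(operating_ratio(0))                 # roughly (58.4, 0.0513)
    print(single_sampling_plan(0.01, 0.05))   # acceptance number and sample size
```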

Sample sizes obtained in this way will be too large when the true distribution
of defectives follows the binomial or hypergeometric laws. There is, however, a
gain in protection due to the extra inspection. For the binomial case the exact
values $p_1$ and $p_2$ for a given $n$ and $c$ can be calculated, using a table of the
5 per cent points of the F (variance ratio) distribution. We may take

    $$n_1 = 2(n - c)$$

    $$n_2 = 2(c + 1)$$

    $$F_1 = F(n_1, n_2)$$

    $$F_2 = F(n_2, n_1)$$

Then

    $$p_1 = \frac{n_2}{n_2 + n_1 F_1}$$

and

    $$p_2 = \frac{n_2 F_2}{n_1 + n_2 F_2} ,$$

utilizing a property of the F distribution pointed out in [3], page 2.

TABLE I

Single sample inspection plans
alpha = beta = .05

    R_0        c        n p_1
    58.        0        .051
    13.        1        .355
    7.5        2        .818
    5.7        3        1.366
    4.6        4        1.970
    4.0        5        2.61
    3.6        6        3.29
    3.3        7        3.98
    3.1        8        4.70
    2.9        9        5.43
    2.7        10       6.17
    2.63       11       6.92
    2.53       12       7.69
    2.44       13       8.46
    2.37       14       9.25
    2.30       15       10.04
    2.24       16       10.83
    2.19       17       11.63
    2.14       18       12.44
    2.10       19       13.25
    2.07       20       14.07
    2.03       21       14.89
    2.00       22       15.72
    1.92       25       18.22
    1.81       30       22.44
    1.71       37       28.46
    1.61       47       37.20
    1.61       63       51.43
    1.335      129      111.83
    1.251      215      192.41

In view of the approximate nature of this table due to the Poisson distribution,
it is suggested that when the calculated value of $R_0$ does not appear, the table be entered
with the next larger value. This rule will result in partial compensation for the
approximation.
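
For readers without an F table, the same exact binomial values can be obtained by solving the binomial sum directly. The sketch below deliberately swaps in a different technique (bisection on the cumulative binomial sum) for the F-ratio formulas of the note; the two should agree because both express the condition that a lot of quality p is accepted with probability 1 − α at p_1 and β at p_2. The function names are illustrative, and c < n is assumed.

```python
import math

def binomial_cdf(c, n, p):
    """P(X <= c) for a binomial variate with n trials and fraction defective p."""
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(c + 1))

def solve_fraction(c, n, prob):
    """Fraction defective p for which P(X <= c) = prob (decreasing in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if binomial_cdf(c, n, mid) > prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_p1_p2(n, c, alpha=0.05, beta=0.05):
    """Exact binomial p_1 and p_2 for sample size n and acceptance number c."""
    return solve_fraction(c, n, 1.0 - alpha), solve_fraction(c, n, beta)

if __name__ == "__main__":
    p1, p2 = exact_p1_p2(137, 3)
    print(p1, p2, p2 / p1)
```

As a small consistency check on the formulas, n = 1 and c = 0 give n_1 = n_2 = 2, and with the 5 per cent point F(2, 2) = 19 the expressions yield p_1 = 2/40 = .05 and p_2 = 38/40 = .95, which is what the direct solution returns.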
ON AN EQUATION OF WALD

By David Blackwell
Howard University

Let $X_1, X_2, \cdots$ be a sequence of independent chance variables with a com-
mon expected value $a$, and let $S_1, S_2, \cdots$ be a sequence of mutually exclusive
events, $S_k$ depending only on $X_1, \cdots, X_k$, such that $\sum_{k=1}^{\infty} P(S_k) = 1$. Define
the chance variables $n = n(X_1, X_2, \cdots) = k$ when $S_k$ occurs and $W = X_1 +
\cdots + X_n$. We shall consider conditions under which the equation

(1)    $$E(W) = aE(n),$$

due to Wald [3, p. 142], holds.

This equation has various interpretations:

A. $n$ may be considered as defining a sequential test on the $X_i$. If $a$ and
$E(W)$ are known, (1) may be used to determine $E(n)$, the expected number of
observations required by the sequential test [3, p. 142 et seq.].

B. $n$ may be considered as representing a gambling system, i.e., it represents
the point at which a player decides to stop. $W$ then represents his winnings,
and (1), in the special case $a = 0$, says that, if each play is a fair game, then the
system leads to a fair game.

C. $n$ may be considered as the duration of a random walk. The meaning of
$W$ and (1) is obvious.
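
A small exact computation may make equation (1) concrete. The sketch below is an illustration with made-up ingredients: the step distribution (−1 or 2 with probability one-half each, so a = 1/2) and the stopping rule (stop when the running sum reaches 3, or after 8 plays at the latest) are hypothetical choices, not taken from the note. Because the rule at play k looks only at the first k values and n is bounded, E(W) and aE(n) can be computed by enumerating every path with exact fractions.

```python
from fractions import Fraction

# Hypothetical step distribution: X = -1 or 2, each with probability 1/2.
STEP = {-1: Fraction(1, 2), 2: Fraction(1, 2)}
A = sum(x * p for x, p in STEP.items())          # common expected value a = 1/2

def stop(partial_sums):
    """Hypothetical test: stop when the running sum reaches 3, or after 8 plays."""
    return partial_sums[-1] >= 3 or len(partial_sums) == 8

def expectations(partial_sums=(), prob=Fraction(1)):
    """Return (E(W), E(n)) accumulated over all paths extending partial_sums."""
    if partial_sums and stop(partial_sums):
        return prob * partial_sums[-1], prob * len(partial_sums)
    last = partial_sums[-1] if partial_sums else 0
    e_w = e_n = Fraction(0)
    for x, p in STEP.items():
        w, n = expectations(partial_sums + (last + x,), prob * p)
        e_w += w
        e_n += n
    return e_w, e_n

if __name__ == "__main__":
    e_w, e_n = expectations()
    print(e_w, e_n, A * e_n)      # E(W) and a*E(n) coincide
```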

More exactly, we shall investigate conditions on $X_i$ under which (1) holds
for every test $n$ of finite expected value. Our results, Theorems 1 and 2, are
that (1) holds if the $X_i$ have identical distributions, or if they are uniformly
bounded. Theorem 1 is a generalization of a result of Wald [3, p. 142].

The test $n$ may be considered as a test on the variables $Y_i = X_i - a$. Then
$W' = Y_1 + \cdots + Y_n = W - na$, so that $E(W') = 0$ is equivalent to (1) for
tests of finite expected value. Thus it is no loss of generality to assume $a = 0$
and to seek conditions under which $E(W) = 0$. We remark that if $E(n)$ does
not exist, then $E(W)$ need not be zero. For example define $X_i = \pm 1$ with
probability $\tfrac{1}{2}$, and $n$ as the smallest integer $k$ for which $X_1 + \cdots + X_k = 1$.
Then $E(W) = 1$. (It follows from Theorem 1 or 2 that $E(n)$ cannot exist, which
can also be shown directly.)

Theorem 1. If $X_1, X_2, \cdots$ have identical distributions, $E(X_i) = 0$, $E(n) < \infty$,
then $E(W) = 0$.

Proof: Define chance variables $n_k$ inductively as follows: $n_1 = n$. Supposing
$n_1, \cdots, n_k$ to be defined, define $n_{k+1} = n(X_{n_1 + \cdots + n_k + 1}, X_{n_1 + \cdots + n_k + 2}, \cdots)$,
i.e. $n_1, n_2, \cdots$ are the successive values of $n$ obtained by iterating the test.
Then

(2)    $$P(n_1, \cdots, n_k ; \; n_{k+1} = j) = P(S_j).$$

For the event $\{n_1 = a_1, \cdots, n_k = a_k\} = R$ depends only on $X_1, \cdots, X_{a_1 + \cdots + a_k}$,
while under the hypothesis $R$ the event $\{n_{k+1} = j\}$ coincides with the event $S =
\{n(X_{a_1 + \cdots + a_k + 1}, \cdots) = j\}$. Thus $P_R(S) = P(S)$. Finally $P(S) = P(S_j)$
since $S$ is defined by imposing the same conditions on $X_{a_1 + \cdots + a_k + 1}, \cdots$ that $S_j$
imposes on $X_1, \cdots, X_j$. (2) shows inductively that $n_1, n_2, \cdots$ are defined
everywhere and are mutually independent with identical distributions. Now
define $W_k = X_{n_1 + \cdots + n_{k-1} + 1} + \cdots + X_{n_1 + \cdots + n_k}$. A similar argument shows that
$W_1 (= W), W_2, \cdots$ are also independent variables with identical distributions.
The strong law of large numbers [2, p. 488] asserts that, with probability one,

(3)    $$\frac{X_1 + \cdots + X_N}{N} \to 0 \text{ as } N \to \infty .$$

It follows that, with probability one,

    $$\frac{W_1 + \cdots + W_k}{n_1 + \cdots + n_k} \to 0 \text{ as } k \to \infty .$$

For if

    $$\Big| \frac{W_1 + \cdots + W_k}{n_1 + \cdots + n_k} \Big| > \epsilon \text{ for an infinite number of } k,$$

then

    $$\Big| \frac{X_1 + \cdots + X_N}{N} \Big| > \epsilon \text{ for an infinite number of } N,$$

which by (3) is an event of probability zero. Also from the strong law of large
numbers $(n_1 + \cdots + n_k)/k \to E(n)$ with probability one. Then

    $$\frac{W_1 + \cdots + W_k}{k} = \Big( \frac{W_1 + \cdots + W_k}{n_1 + \cdots + n_k} \Big) \Big( \frac{n_1 + \cdots + n_k}{k} \Big) \to 0$$

with probability one. It follows from the converse of the strong law of large
numbers [2, p. 488] that $E(W_1) = E(W) = 0$.

Write $S_1 + \cdots + S_k = U_k$, $C(U_k) = V_k$, so that $V_k = \{n > k\}$. Then (a)
$V_k$ depends only on $X_1, \cdots, X_k$, (b) $V_1 \supset V_2 \supset \cdots$, $P(V_k) \to 0$. Conversely
any sequence of sets $V_k$ satisfying (a) and (b) defines a sequential test on $X_i$;
define $n = k$ on $V_{k-1} C(V_k)$. Moreover $E(n) < \infty$ if and only if (c) $\sum P(V_k)$
converges [1, p. 297]. Now

    $$E(W) = \lim_{N \to \infty} \sum_{k=1}^{N} \int_{S_k} (X_1 + \cdots + X_k) \, dP = \lim_{N \to \infty} \sum_{k=1}^{N} \int_{S_k} (X_1 + \cdots + X_N) \, dP$$

    $$= \lim_{N \to \infty} \int_{U_N} (X_1 + \cdots + X_N) \, dP = -\lim_{N \to \infty} \int_{V_N} (X_1 + \cdots + X_N) \, dP .$$

This establishes the following

Lemma: If $E(X_i) = 0$, then $E(W) = 0$ for every test of finite expected value if
and only if for every sequence of sets $V_N$ satisfying (a), (b), (c),

    $$\int_{V_N} (X_1 + \cdots + X_N) \, dP \to 0.$$

From this condition we obtain easily

Theorem 2. If $E(X_i) = 0$, $|X_i| < M$, $E(n) < \infty$, then $E(W) = 0$.

Proof: If $V_k$ is a sequence of sets satisfying (a), (b), (c), then

    $$\Big| \int_{V_N} (X_1 + \cdots + X_N) \, dP \Big| \le M N P(V_N).$$

Now the series $\sum P(V_N)$ is a convergent series with decreasing positive terms.
It is well known that under these conditions $N P(V_N) \to 0$. It follows from the
lemma that $E(W) = 0$.

The question of finding sufficient conditions for $E(W) = 0$ more general than
those given in Theorems 1 and 2 is of interest. The bare condition $E(X_i) = 0$ is
not sufficient, as the following example (which is simply the system of doubling
the stake) shows: $X_i = \pm 2^i$ with probability $\tfrac{1}{2}$; $n$ is the smallest integer $k$ for which
$X_k > 0$. A simple computation shows $E(n) = E(W) = 2$. It is well known
that the expected amount of capital required for the above system is infinite.
That this is generally true for such systems is shown by the following theorem,
in which no hypothesis is made concerning the existence of $E(n)$.
Theorem 3. If $E(X_i) = 0$, $E(W) > 0$, then $E(Z) = -\infty$, where

    $$Z = \min_{k \le n} (X_1 + \cdots + X_k).$$

Proof: It follows from the proof of the lemma that

    $$\int_{V_N} (X_1 + \cdots + X_N) \, dP \to -E(W).$$

Now on $V_N$, $Z \le (X_1 + \cdots + X_N)$. Hence

    $$\lim_{N \to \infty} \int_{V_N} Z \, dP \le -E(W).$$

Thus $E(Z)$ cannot exist if $E(W) > 0$, since $P(V_N) \to 0$. Since $Z \le X_1$, $\int_{Z \ge 0} Z \, dP$
exists; consequently $E(Z) = -\infty$.
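
The doubling system mentioned before Theorem 3 illustrates both statements: E(n) = E(W) = 2 although each play is fair, and the expected minimum Z of the running winnings is minus infinity. The sketch below is illustrative (function names are not from the note). It computes W and Z for each stopping point k exactly and accumulates the expectations truncated at n ≤ K, which shows E(n) and E(W) settling at 2 while the truncated E(Z) decreases without bound.

```python
from fractions import Fraction

def doubling_outcome(k):
    """Stop at play k: lose 2, 4, ..., 2^(k-1), then win 2^k. Return (W, Z)."""
    running, partial_sums = 0, []
    for i in range(1, k):
        running -= 2 ** i
        partial_sums.append(running)
    running += 2 ** k
    partial_sums.append(running)
    return running, min(partial_sums)

def truncated_expectations(K):
    """Contributions to E(n), E(W), E(Z) from the events n = 1, ..., K."""
    e_n = e_w = e_z = Fraction(0)
    for k in range(1, K + 1):
        prob = Fraction(1, 2 ** k)        # k - 1 losses followed by one win
        w, z = doubling_outcome(k)
        e_n += prob * k
        e_w += prob * w
        e_z += prob * z
    return e_n, e_w, e_z

if __name__ == "__main__":
    for K in (10, 100, 1000):
        e_n, e_w, e_z = truncated_expectations(K)
        print(K, float(e_n), float(e_w), float(e_z))
```

Each additional stopping point beyond the first contributes nearly −1 to the truncated E(Z), so the partial sums diverge linearly in K.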
CORRECTION TO THE PAPER "ON A PROBLEM OF ESTIMATION
OCCURRING IN PUBLIC OPINION POLLS"

By H. B. Mann
Ohio State University

In the paper "On a problem of estimation occurring in public opinion polls"
(Annals of Math. Stat., Vol. 16 (1945), pp. 85-90) the author made the assertion
that, in the notation of the paper, $E[(e_i - r_i)^2]$ is always smaller than $E[(t_i - e_i)^2]$.
This statement is incorrect and its supposed proof contains a numerical error
in the fourth line from above on p. 90.

We have

    [expression for $E(r_i^2)$ as a double integral in $x$ and $y$; illegible in the source]

The last integral is tabulated in Karl Pearson's Tables for Statisticians and
Biometricians, Vol. 2, p. 93. Comparing this table with a table of the normal
probability integral it may be seen that there exists a value $\delta$ such that

    $$E(e_i^2) > E(r_i^2) \text{ for } c < \delta ,$$

    $$E(e_i^2) < E(r_i^2) \text{ for } c > \delta .$$

The quantity c lies in the neighborhood of 2.

I am indebted to Professor J. W. Tukey for bringing the error to my attention.



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

The following members of the Institute are teaching in Army University Cen- 
ters in Shrivenham, England; Biarritz, France; and Florence, Italy: T. A. 
Bancroft, Alonzo Cohen, E. E. Blanche, P. R. Rider. 

Dean Walter Bartky of the University of Chicago has been appointed as the 
representative of the Institute of Mathematical Statistics to the Division of 
Physical Sciences of the National Research Council. 

Mr. Clyde A. Bridger represented the Institute at the Inauguration of Dr.
F. S. Harris as President of Utah State Agricultural College on November 16.

Dr. C. West Churchman has resigned his position at Frankfort Arsenal and has 
accepted the appointment of Assistant Professor of Philosophy at the University 
of Pennsylvania. 

Assistant Professor D. B. DeLury of the University of Toronto has been ap- 
pointed to an associate professorship at Virginia Polytechnic Institute. 

Mr. George Eldredge, formerly with the Aluminum Research Laboratories at 
New Kensington, Pennsylvania is now corrosion chemist with the Shell De- 
velopment Company at Emeryville, California. 

Dr. Will Feller of Cornell University has been appointed as the representative
of the Institute of Mathematical Statistics on the Policy Committee of the 
Mathematical Organizations. 

M. Bernard Hecht has joined the International Resistance Company, Phil- 
adelphia, as head of the Quality Control Department. 

Lt. Col. Paul Horst has returned to his previous position at Proctor and 
Gamble at Cincinnati. 

Professor Harold Hotelling of Columbia University has been made a part time 
consultant on statistical problems to the Division of Statistical Standards of 
the Bureau of the Budget. 

Dr. S. B. Littauer is now chairman of the Mathematics Department of New- 
ark College of Engineering at Newark, N. J. 

Lieutenant Commander A. L. O’Toole has been decorated with a Bronze 
Star Medal for his outstanding service in the South Pacific during the past two 
years. 

Associate Professor H. H. Pixley of Wayne University has been appointed
Assistant Dean of the College of Liberal Arts. 

Dr. H. B. Mann has been appointed to an associate professorship at Ohio 
State University. 

Miss Dorothy J. Morrow has been appointed to an assistant professorship at
George Washington University. 

Professor C. J. Rees of the University of Delaware has received a citation for
his work in a civilian capacity with the 14th Air Force Headquarters.

Dr. L. V. Toralballa is a special instructor in the Mathematics Department at
the University of Michigan.

Associate Professor Abraham Wald of Columbia University has been promoted 
to a professorship. 

Mr. Grover C. Wirick, Jr. is doing graduate work at the University of
Michigan. 

Henry Goldberg of the Columbia University Statistical Research Group died 
April 19, 1945. 


During the last quarter of 1945, many members of the Institute engaged in
statistical quality control were favored by visits from Messrs. W. A. Bennett and
M. Milbourn, the successful candidates in a scholarship competition organized
by the Quality Control Panel associated with the Midland Region of the British
Ministry of Production. In the competition, which offered a three
months' trip to the United States as a prize, 92 papers on industrial applications
of statistical methods were submitted. This Panel has been active in organizing
regular discussion groups and in arranging courses of lectures at the Birmingham
Technical College, later published by the Birmingham District Committee as
a "Symposium of Papers on Quality Control", copies of which are still available.

Mr. Bennett is Works Manager of the English Needle and Fishing Tackle
Co., Ltd., of Redditch, and Mr. Milbourn is a physicist who has worked mainly
in the field of spectrographic analysis and physical metallurgy in the Research
Department of Imperial Chemical Industries, Metals Division, Birmingham.
It is natural, therefore, that Mr. Bennett's paper dealt with the management
problem of organizing a Statistical Quality Control Bureau and defining its
duties, whereas Mr. Milbourn's paper considered the operation of quality control
techniques as a means for detecting and identifying causes in production research.

Toward the close of their visit in this country they indicated that the future 
of Quality Control, both here and abroad, will depend on establishing an adequate 
theory of control that includes statistical along with all other necessary factors. 
This provides a challenge that must be answered by the statistical societies and 
the colleges, as well as by the quality control people. 


New Members 

The following persons have been elected to membership in the Institute:

Bat, Kenan Y. (Columbia) Statistical Control, Hq. AFPDC, 830 West Broadway, Louisville
3, Kentucky

Coles, James Stacy, Ph.D. (Columbia) Research Supervisor, Underwater Explosives Re-
search Laboratory, Woods Hole Oceanographic Institution, Box 631, Woods Hole, Mass.
F York 'y Admml8tratm A8S ' t '’ 10(18 l8laUd CRy n,S ” 4U W ' 1Hth 8u * New 
Orelder, C. Edwin, Jr., B.A. (Michigan) Actuarial Clerk, 1086 Glenwood Blvd., Schenectady

Gulliksen, Prof. Harold, Ph.D. (Chicago) Psychology Dept., Princeton University, Prince-

Harrison, Joseph O., Jr., B.S. (George Washington) 8605 Kingsbridge Ave., Apt. SF, New
York, N. Y.

Hodges, Joseph Lawson, Jr., A.B. (California) Operations Analyst, Army Air Forces,
1867 Park Road, N.W., Washington 10, D. C.

Hoskins, Robert Heywood, A.B. (Harvard) Radio Technician, Third Class, U. S. Navy,
Teaching Fellow in Mathematics, Harvard University, Separation 8, Separation Center,
Shoemaker, California

Lowry, Edward D., Statistician (Western Cartridge Co., E. Alton) 60S 5th St., East
Alton, Ill.

Rees, Prof. Carl J., Ph.D. (Pennsylvania) Head of Math. Dept., Univ. of Delaware,
Newark, Delaware

Seth, Gobind Ram, M.A. (Delhi) Lecturer in Math., Hindu College, Delhi (On Leave)
13^6, John Jay Hall, 116th Street, Columbia University, New York 87, N. Y.

Silber, Jack, B.S. (Chicago) 4908 N. Springfield Ave., Chicago 85, Ill.

Stone, Goldie F., A.M. (New York) 678 Dawson St., Bronx, New York, N. Y.

Szatrowski, Zenon, Ph.D. (Northwestern) Instructor in Economics Department, North-
western University, Evanston, Ill.

Wadley, Francis Marion, Ph.D. (Minnesota) Statistical Consultant, Bur. of Entomology
and Pl. Quar., USDA, 3816 N. Albemarle, Arlington, Virginia

Waugh, Frederick V., Ph.D. (Columbia) Agricultural Economist (Office of War Mobil.
and Recon.) 1006-86 Street, South, Arlington, Virginia



REPORT ON THE CLEVELAND MEETING OF THE INSTITUTE 


A meeting of the Institute of Mathematical Statistics was held in Cleveland,
Ohio, Thursday to Sunday, January 24-27, 1946, in conjunction with the Annual
Meetings of the American Statistical Association and the Econometric Society.
The following 115 members of the Institute attended the meeting:

Beatrice Aitchison, Armen A. Alchian, Franz L. Alt, Richard L. Anderson, Kenneth J.
Arnold, Max Astrachan, George J. Auner, Kenan Y. Bal, Walter Bartky, William D. Baten,
Harold R. Bellinson, Archie Blake, Chester I. Bliss, Albert H. Bowker, T. H. Brown, Robert
W. Burgess, Oscar K. Buros, Irving W. Burr, Burton H. Camp, C. West Churchman, William
G. Cochran, Edward P. Coleman, Francis G. Cornell, Jerome Cornfield, Donald R. G.
Cowan, Dudley J. Cowden, Gertrude M. Cox, John H. Curtiss, Joseph F. Daly, Cuthbert
Daniel, Besse B. Day, Walter L. Deemer, Jr., Daniel B. DeLury, W. Edwards Deming,
Bernard Dempsey, Paul S. Dwyer, Churchill Eisenhart, Mary L. Elveback, Benjamin
Epstein, Wilmoth D. Evans, Carl H. Fischer, Irving Fisher, T. N. E. Greville, Trygve
Haavelmo, Clausin D. Hadley, Margaret J. Hagood, K. W. Halbert, Morris H. Hansen,
Boyd Harshbarger, Byron H. Hayden, Harold Hotelling, Earl E. Houseman, Leonid Hurwicz,
William Hurwitz, Calvin J. Kirchen, Lila F. Knudsen, Hendrik H. Konijn, Tjalling
Koopmans, Morton Kramer, Anita R. Kury, Robert Ladd, Dickson H. Leavens, Roy B. Leipnik,
E. Vernon Lewis, Eugene Lukacs, Henry B. Mann, George E. T. Mayer, Edward C.
Molina, Alexander M. Mood, Margaret Moore, Joseph E. Morton, Frederick C. Mosteller,
Charles McC. Mottley, Paul M. Neurath, Horace W. Norton, Edwin G. Olds, Paul S.
Olmstead, Guy H. Orcutt, James G. Osborne, Russell F. Passano, Paul Peach, Alice K.
Andrews Priestley, James Rafferty, Sophie Rakesky, Charles F. Roos, A. C. Rosander,
Herman Rubin, Phillip J. Rulon, Marion M. Sandomire, Franklin E. Satterthwaite, Father
Schaeffer, Edward M. Schrock, David H. Schwartz, G. R. Seth, Lawrence W. Shaw, Jack
Sherman, Walter A. Shewhart, Walt R. Simmons, Leslie E. Simon, John H. Smith, J. R.
Steen, Joseph Steinberg, Henry W. Steinhaus, J. W. Sullivan, Zenon Szatrowski,
Benjamin Tepping, John W. Tukey, Helen M. Walker, W. Allen Wallis, A. E. R. Westman,

YnL!" 1 " 1 E "'* botl ‘ w - “**■ »• Wi “" r ' wSi k 

The first session of the meeting was held jointly with the American Statistical
Association on Thursday afternoon on Numerical Solution of Regression Equations
under the chairmanship of Dr. W. E. Deming of the Bureau of the Budget.
The following papers were presented:

L i U G^o!Z,fZ rma T n f c r r ' ,a " m mi C”#c 

Dr. Guy Orcutt, Massachusetts Institute of Technology

2. The Square Root Method for the Solution of Regression Equations.

Mr D B Duncan, Royal Australian Air Force 

3. Error Control in Matrix Calculation.

F. E. Satterthwaite, Aetna Life Insurance Company

4. The Compact Computation of Canonical Correlations.
Professor P. S. Dwyer, University of Michigan

A symposium consisting of a morning and an afternoon session was
held jointly with the Econometric Society and the American Statistical
Association on Statistical Inference from Nonexperimental Observations. Dr.
Mordecai Ezekiel acted as chairman of the morning session and Dr. R. L.
Anderson was chairman of the afternoon session. Of the following four
papers, the first two were presented in the morning and the last two in the
afternoon:

1. The Economist’s Problem of Statistical Inference.
Professor J. Marschak, Cowles Commission

2. Prediction and Structural Estimation.

Mr. Leonid Hurwicz, Cowles Commission

3. Iterative Computation Methods in Estimating Simultaneous Relations.
Dr. T. Koopmans and Mr. Roy B. Leipnik, Cowles Commission

4. Multivariate Analysis in Economics.
Professor Gerhard Tintner, Iowa State College

On Friday afternoon a session on Experimental Designs and their Analysis 
was held jointly with the Biometrics Section of the American Statistical Associa- 
tion under the chairmanship of Professor Gertrude Cox of North Carolina State 
College. The following papers were presented: 

1. On the Uses of Orthogonal Functions in the Analysis of Incomplete Latin Squares.
Professor D. B. DeLury, Virginia Polytechnic Institute

2. Use of Adjusting Factors in the Analysis of Data with Disproportionate Subclass Numbers.

Professor R. E. Patterson, Texas A. and M. College

3. Selection of Sample Size for Detecting Treatment Differences.
Professor A. M. Mood, Iowa State College

4. Rectangular Lattices.

Professor Boyd Harshbarger, Virginia Agricultural Experiment Station

On Saturday, a two-session symposium was held jointly with the Econometric
Society and the American Statistical Association on Sampling in the Social 
Sciences. Professor Arnold J. King of Iowa State College acted as chairman for 
the morning session and Professor S. S. Wilks of Princeton University presided 
in the afternoon. The following seven papers were presented, of which the first 
three were presented in the morning and the remainder in the afternoon: 

1. Problems and Methods of a Sample Survey of Business. 

Mr. M. H. Hansen, Bureau of the Census

2. Problems of Area Sampling in Agriculture.

Mr. J. R. Goodman, Bureau of the Census, and Mr. E. E. Houseman, Bureau of
Agricultural Economics

3. Problems of Area Sampling in Population.

Mr. B. J. Tepping and Mr. J. S. Steinberg, Bureau of the Census

4. The Problems of Non-Response.

Mr. W. N. Hurwitz, Bureau of the Census

5. Systematic Sampling and its Relation to Other Sampling Designs. (Read by Title.)
Mrs. Lillian H. Madow, Washington

6. Relative Accuracies of Systematic and Stratified Random Sampling for a Specified Class
of Populations.

Professor W. G. Cochran, Iowa State College

7. On the Design of a Sample of Dealers' Inventories.

Dr. W. E. Deming, Bureau of the Budget, and Dr. Willard Simmons, Office of Price
Administration 

On Sunday, a symposium was held jointly with the American Statistical 
Association on Acceptance Sampling under the chairmanship of Professor John 
W. Tukey of Princeton University. The morning session of the symposium
was devoted to acceptance sampling by attributes and the afternoon session to 
acceptance sampling by variables. The following program was presented at the 
morning session: 

Papers

1. Prewar Developments.

Mr. Paul Peach, North Carolina State College
2. Wartime Developments.

Professor E. G. Olds, Carnegie Institute of Technology
Prepared Discussion by:

Mr. H. R. Bellinson, Army Ordnance Department

Mr. D. H. Schwartz, Quartermaster Corps

Professor Walter Bartky, University of Chicago

In the afternoon session the following program was presented:

Papers

1. Lot Quality Measured by Average or Variability.

Lt. Commander J. H. Curtiss, Bureau of Ships
2. Lot Quality Measured by Proportion Defective.

Mr. W. A. Wallis, Columbia University
Prepared Discussion:

Mr. E. M. Schrock, Army Ordnance
Professor A. M. Mood, Iowa State College
Professor K. J. Arnold, University of Wisconsin
Lt. Commander J. F. Daly, Bureau of Ships
Dr. A. E. R. Westman, Ontario Research Foundation

A business meeting of the Institute was held at 5 p.m. on Saturday afternoon
at which time reports were made by the President, Secretary-Treasurer, Editor 
and Chairman of the Committee on Development. These reports are all 
printed in the current issue of the Annals. 

Paul S. Dwyer,

Secretary, 



ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE 

(For 1945) 

I. Development of Public Appreciation for Mathematical Statistics

The aims of the Institute, as stated in the constitution, are to promote the 
interests of mathematical statistics. First and foremost, research must go on. 
The Annals must be published and its position maintained as the world’s leading 
journal in mathematical statistics. Meetings must be held to provide for further 
dissemination and discussion of research. But this is not all. We should fall 
short of our opportunities for promoting the interests of mathematical statistics 
if we were to lose sight of the need for creating an environment in which mathe- 
matical statistics and statisticians can thrive and take their proper place for 
rendering the service that they are capable of rendering in the political, industrial, 
and scientific life of the nation.

A fair share of the efforts of the officers and committees of the Institute this 
past year has been devoted to the creation of this environment. The Institute 
has assumed leadership in several movements of importance in this direction 
and has lost no opportunity to cooperate with other organizations toward the 
same ends. Momentum has thus been given to important developments which 
are bound to affect the scientific advancement and employment opportunities 
of all people engaged in statistical work of any kind, whether it be mathematical 
research, consulting, teaching, major or minor roles in large-scale statistical 
projects, preparing questionnaires, designing experiments, analyzing results, 
formulating conclusions and recommendations, or taking part in any other way 
in the collection or use of statistical data. Briefly, these developments fall 
under three main headings. 

(i) Setting standards of professional competence. The Description of the Profession
of Statistics, put out by the National Roster this year, has gone a long
way as a first step toward setting standards of professional competence. The 
officers and many members of the Institute assisted the Roster, particularly 
Professor Harold Hotelling and his Committee on the Teaching of Statistics, 
together with Dr. C. I. Bliss representing the American Statistical Association. 
Although the Roster Description is not intended to represent the official attitude 
of the Institute, it does represent cooperative effort toward cultivation of public 
understanding of statistical work. 

(ii) Raising the standards of teaching. Standards of teaching go hand in hand
with standards of professional competence. The Institute can proudly point 
to the accomplishments of its Committee on the Teaching of Statistics, which 
under the chairmanship of Professor Hotelling, has persistently set forth stand- 
ards of teaching which are bound to bring about important changes in the ar- 
rangement of statistical courses and organization of statistical teaching. An 
inevitable result will be greater competence in statistical theory, better research, 
and expanding avenues for more effective application of theory. 

(iii) Promoting public understanding and appreciation for the statistician.
More adequate public appreciation of statistical theory can be brought about in 
several ways. The first two of these are being actively pursued by the officers 
of the Institute. The third constitutes a proposal; and the fourth, an obligation
incumbent on every member of the Institute. 

First, through joint meetings with other professions such as sociologists, 
economists, psychologists, engineers, biometricians, etc. The Cleveland meeting
is an example; the St. Louis meeting of the A.A.A.S. to be held in March is
another. These joint sessions give opportunity for other groups to become
aware of the impact of mathematical statistics on their own work, and for
mathematical statisticians to hear of the statistical problems in other fields.
Opportunities for such diffusion of knowledge exist in local chapters as well as in national
meetings, and every member of the Institute should be on the lookout for oppor- 
tunities to explain how problems in administration, management, economics, 
and manufacturing, are going to require modification in the future owing to new 
work in sampling techniques, acceptance procedures, quality control, and other 
developments of mathematical statistics. 

The federation of statistical societies (see Part III) will afford better means
than existed heretofore for an admixture of mathematical statistics with fields 
of application, both in national and local meetings. 

Second, through the work of committees whose responsibility is to advise
professional groups, and government and private research agencies, concerning
the use of mathematical statistics. A notable example is the Joint Committee
for the Development of Statistical Applications in Engineering and Manufacturing,
of which Dr. W. A. Shewhart is chairman. The Institute has two representatives
on it. Much of the recent advancement of statistics in industry is
traceable to the work of this committee. 

Third, through the establishment and publication of colloquium lectures as
recommended by Dr. Shewhart in his report for the preceding year, or of an
annual Rietz lecture of broad interest as recommended by this year's Committee
on Development (cf. Appendix A, Part V). 

Fourth, informally through expository nonmathematical articles and lectures
delivered by leading mathematical statisticians before gatherings of nonstatistical
groups of professional and business men. Such activity is of course informal
and without record, carried on by individuals as opportunity permits and not by
official announcement from the office of the Institute. 

II. Long-range Planning 

Through the work of several of the Institute’s committees, each tackling 
specific areas of enquiry, the Institute is being provided with long-range policies
and planning. In particular, the reports of the following committees should be
cited in this connection: 

The Committee on Development (Appendix A) 

The Committee on the Teaching of Statistics (Appendix B)

The Committee on Finance (Appendix C)

The Committee on Policy in Regard to Local Chapters (Appendix D) 

These committees are obviously alive to the recent rapid expansion of mathe- 
matical statistics in industry and government, and to the opportunities that lie 
ahead for developing proper environment for greater expansion and service of
mathematical statistics.

III. Federation of Statistical Societies 

A movement of extreme importance to all statistical workers is the proposed 
reorganization of the American Statistical Association as the central organization 
for all statistical societies. This movement owes its impetus largely to the 
recommendation made by our Committee on Development a year ago, and to 
the active part that our officers and representatives played in organizing and 
assisting the Inter-Society Committee. This movement is centripetal and 
replaces the centrifugal forces that were splitting statistical organizations. 
Under the new arrangement, statistics will possess a united front on matters of 
common interest, yet each organization will maintain its autonomy. Nothing 
is to be sacrificed in the way of standards of membership, meetings, or publications.
Economies will be effected through combined office operations. Much
will be gained through coordinated effort; wide distribution of a journal of 
general methodology and applications; development of public appreciation for 
statistical work through dissemination of reliable information concerning statis- 
tical science and its contributions; cooperation with local and international 
statistical groups; promotion and development of professional standards of 
statistical work; and through cooperation with other professional groups in 
fields of application. 

This federation is not yet accomplished; it is still in process of formulation, but 
it is probably safe to say that agreement on general aims has been reached, as 
well as on many items of detail. The proposition will in time be put up to each 
statistical organization for acceptance. 

IV. Growth and Expansion 

During the year the membership increased from 606 to 777. The work of 
the Institute, vitally affecting many thousands of statistical workers through 
its efforts to enhance public confidence and appreciation for theoretical statistics 
as well as to improve the quality of statistical work, extends far beyond the en- 
vironment of its nearly 800 members. Concerted drives for membership should 
continue, but should not be expected to take the place of personal invitation 
in the form of explanation, one man to another, of what the Institute stands for. 
The outlook is encouraging. Year by year as the work and influence of the 
Institute receive wider success and recognition, more and more people will be 
found ready and desirous of joining.

V. Administrative Affairs


As with any active organization, there are certain chores to be done and internal
affairs to be administered. The chief burden falls on the executive officer,
our Secretary-Treasurer, Paul S. Dwyer, who is expected 

i. To keep the list of members up to date with addresses and titles. Furnish 
information to the Board regarding increases and decreases in membership,
and issue the Directory.

ii. To send out notices, to keep the membership informed concerning meet- 
ings and other items of interest. 

iii. To send out bills, and keep the books showing payment of dues and
subscriptions.

iv. To fill orders for back numbers of the Annals. 

v. To estimate the probable demand for copies of the Annals, current and 
past, and to place orders with the printer to be able to supply the demand. 

vi. With the Committee on Finance, to keep the Board posted on the ex- 
pected expenditures and income for the year ahead. 

vii. To answer correspondence from other organizations and individuals who 
desire information concerning the Institute.

viii. To keep a record of proceedings of the Board and business meetings of the 
Institute. 

ix. To work with the various committees of the Institute, keeping them in- 
formed and in line on policy, constitution, by-laws, and other commit- 
ments. 

x. With the Committee on Programs, to arrange sessions of contributed 
papers, and to find space in hotels or elsewhere for holding meetings and 
housing members. 

xi. To keep the Board informed concerning recommendations and reports of 
committees, and other matters brought to his attention requiring action 
by the Board. 


xii. To conduct continuous membership and subscription drives with or without
the aid of committees.

It is obvious that when an organization reaches the size and activity of the
Institute, these duties are too onerous to carry on without proper assistance.
Our Secretary-Treasurer should be freed for proper performance of important
functions which only he can render toward the growth and vitalization of the
Institute. Consideration is being given to two possible plans, either of which
will call for some increase in expenditure. One plan is to provide competent and
sufficient assistance in the office of the Secretary-Treasurer, and the other is to
transfer some of his duties (e.g. Items i, ii, iii, iv, x, and xii) to the American
Statistical Association on a cost basis. A cooperative arrangement of this kind
between the Association and the Institute has been discussed informally with Mr.
Kellogg of the A.S.A., who will be able to provide us with cost
estimates a little later. This kind of arrangement would be a first step and serve
as a pilot study in cost-accounting for the ultimate federation of statistical
societies (Part III).

The constitution must be revised, and a committee has been formed to under- 
take the task. The one we have has served well, with minor revisions, over 
the first ten years in the life of the Institute, but conditions are now different and 
thorough reconsideration is needed. Among other things, it needs to be revised 
to permit federation with other statistical societies. As it stands it is totally 
deficient in specifying responsibilities between local chapters and the parent 
society. It should embody the recommendations of the Committee on Policy 
in Regard to Local Chapters, or modifications of these recommendations. Also, 
there are ambiguities in the present constitution that need to be cleared up, and
there is no provision for carrying out the business of the Institute by correspond- 
ence when a Board meeting or Committee meeting can not be held. 

The Committee on Meetings must not only seek out suitable papers for meet- 
ings, carrying out the wishes of the Board in regard to the subject-matter to be 
covered, but must also be concerned with the geographic location of meetings, 
cooperation with other professional societies, and choice of dates. During the 
past few years, in addition, this committee has had to contend with restrictions 
on transportation and hotel space. The Committee on Finance must decide 
what expenditures are wise and allowable; they must make decisions on in- 
vestments and surety bonds. They have calculated the price of life-memberships 
for purchase at various ages. Committees on Membership and on Subscriptions 
must be active. The services rendered by these committees deserve the grateful 
thanks of the members of the Institute. 

Undoubtedly the most lasting contribution that is being made by the Institute
to research in mathematical statistics is the publication of the Annals of
Mathematical Statistics. Without some first-hand knowledge of the problems that are
encountered in publishing a professional journal of high standing it is hardly 
possible to be conscious of the depth of the debt owed by the Institute to Dr. 
Samuel S. Wilks, Editor. During the past few years, in addition to the normal 
editor’s problems of maintaining standards of excellence in the articles published, 
there have been additional difficulties and delays arising from paper and man- 
power shortages in printing. 

In closing this section it is a pleasure to record our appreciation of the
assistance and advice received at various times during the year from Mr. Lester
Kellogg, Secretary of the A.S.A.; also from Mr. E. A. Stephens of the Ohio Bell 
Telephone Company in Cleveland in regard to the difficult problems of hotel 
space which arose in connection with the Cleveland meeting in January 1946. 

VI. Election of Fellows 

Acting in consideration of the advice of the Committee on Membership, the 
Board advanced the following members to the grade of Fellow: 

M. S. Bartlett, Cambridge University 
Trygve Haavelmo, The Norwegian Embassy
William N. Hurwitz, Bureau of the Census
John von Neumann, Institute for Advanced Study 

VII. Election of Officers

The following officers were duly nominated and elected for 1946:
President, William G. Cochran 
Vice Presidents, Will Feller 

Edwin G. Olds 


VIII. Committees and Reports of Committees


Our committees and representatives on joint committees for the year 1946 are 
shown below. The reports of these committees are appended for the information 
of members. It should be borne in mind that committee reports are for consideration
of the Board; they do not commit the Board to any specific action one
way or another. As already intimated, every member of the Institute may take 
pride in the splendid work of these committees. Like the deliberations of the 
Board, most of the deliberations of the committees were necessarily carried out 
by correspondence because no large meetings were held at which the members 
of any committee or the Board could all be brought together. 

During the year we have been asked by Dean L. P. Eisenhart, Chairman of
the Division of Physical Sciences of the National Research Council, to name a 
representative. The Board duly appointed Dean Walter Bartky. The invitation
from Dean Eisenhart to be so represented is a distinct honor and a recognition
of the importance of the Institute in pure and applied research.

We have also been invited to name a representative to the Policy Committee 
for Mathematicians, to which the Board has named Professor Will Feller. On 
the committee are four representatives from the American Mathematical So- 
ciety, one from the Society for Symbolic Logic, and one from the Institute of 
Mathematical Statistics. The Mathematical Association of America has been 
invited to name two representatives. The constitution and purposes of this 
committee are explained in the following paragraphs which are taken from a
statement that was approved by the A.M.S. Council on November 23, 1945:


Representatives of each organization shall be selected in accordance with a plan approved
by the governing body of that organization.

The Secretary of the American Mathematical Society shall be a non-voting, ex officio
member of the committee and shall act as secretary for the committee.

which /rp thl Comra,ttee 8hal] those problems affecting the mathematical profession 

to C ° r : CPrn of th ? WMtituent organizations. It shall lie empowered 

0 rg, 1 T tl0M on mftttere whic)l «>noerr, the position of maths- 

concerning the Wt,-! &S pro j ? 0fse<1 or onacted legislation concerned with science, problems 

other questions which t mathen j atl “ iana °r potential members of our profession, and 

among related stiinces hm &nd the effoctive Potion of mathematics 
ong related sciences, both nationally and internationally. 

already made w “M 1 be Con8trued to affect commitments 

m ernatipnal basis by any of the constituent organizations 





(i.e., among these is the International Congress of Mathematicians for which an invitation
was issued by the American Mathematical Society in 1936).

This Policy Committee shall be appointed for a period of five years. At the end of that 
time the work of the committee shall be reviewed and a decision made concerning the
continuation of the committee.

A supplemental motion passed by the A.M.S. Council asks the Policy Com- 
mittee to concern itself primarily with the profession of mathematics and only 
secondarily with the teaching of mathematics. 

W. Edwards Deming, 

President, 1945. 



Committees of the Institute 


Committee                          Personnel                            Appendix

Development                        William G. Cochran, Chairman         A
                                   Paul S. Olmstead, Acting Chairman
                                   Chester I. Bliss
                                   Henry Scheffé
                                   C. C. Craig
                                   Frederick Mosteller

The Teaching of Statistics         Harold Hotelling, Chairman           B
                                   Walter Bartky
                                   Milton Friedman
                                   W. Edwards Deming

Finance                            Paul S. Dwyer, Chairman              C
                                   Charles F. Roos
                                   Carl Fischer
                                   A. C. Olshen

Policy in Regard to Local          Morris H. Hansen, Chairman           D
Chapters                           Gertrude Cox
                                   Samuel S. Wilks

Meetings                           John H. Curtiss, Chairman            E
                                   T. Koopmans
                                   William G. Madow

Membership                         Joseph L. Doob, Chairman             F
                                   Paul S. Dwyer
                                   T. Koopmans
                                   Will Feller

Increasing Subscriptions to        W. D. Baten, Chairman                G
Libraries and Laboratories         Harold F. Dodge
                                   Irving W. Burr
                                   L. Aroian

Tabulation                         Paul S. Dwyer, Chairman
                                   Will Feller
                                   Churchill Eisenhart

Nominations                        C. C. Craig, Chairman
                                   Frederick F. Stephan
                                   Gertrude Cox

Revising the Constitution and      Morris H. Hansen
By-Laws                            Allen T. Craig
                                   Chester I. Bliss
                                   John Curtiss

Representatives to the Inter-      John H. Curtiss
Society Committee on Federation    Paul S. Olmstead

Representative to the Division     Walter Bartky
of Physical Sciences, National
Research Council

Representative to the Policy       Will Feller
Committee for Mathematicians

Representative to Explain the      W. Edwards Deming
Need of Mathematical Statistics
in Research for Defense

Representatives to the Joint       Samuel S. Wilks
Committee for the Development      Paul R. Rider
of Statistical Applications
in Engineering and
Manufacturing


Appendix A 

Report from the Committee on Development 
I. General 

Continuing the work of the 1944 Committee on Post-War Development, this 
Committee has analyzed the purpose and policy of the Institute to see what 
additional activities the Institute should undertake in order to provide further 
stimulus to the development of the field of mathematical statistics. The fol- 
lowing existing and proposed activities were considered: 

1. Maintenance of professional standards

2. Publications program

3. Meetings program 

4. Rietz Lecture 

5. Chapter policy 

6. Cooperation in determining educational standards 

7. Maintaining relationships with other technical societies 

8. Increasing membership of the Institute 

In general, each of these activities is placed in the hands of a committee. Except
in a few instances, reports of these committees have not been published in the
Annals. This committee recommends that each of the committees of the Institute
together with the representatives of the Institute on joint committees be
requested by the Board of Directors to submit a yearly report for possible publica- 
tion in the March issue of the Annals so that the members of the Institute may 
be kept informed concerning the Institute’s affairs. 

II. Professional Standards 

This committee believes that the Report of the Membership Committee
published in the March 1945 issue of the Annals is typical of the kind of report 
desired, providing, as it does, an outline of present standards for membership 
in the Institute. 


III. Publications 

The publication program has been discussed with the Editor and we find that
we are in agreement with the present editorial policy. We recommend that the 
Editor submit a yearly report. 

Although an increased membership among those engaged primarily in the 
application of statistics is desirable, it is not considered advisable to alter radically 
the character of the Annals in order to attract such membership. However, 
writers on theoretical topics in the Annals should be encouraged to include illus- 
trations of applications whenever feasible. A desirable goal at which to aim 
would be for every issue of the Annals to contain an expository paper reviewing 
progress in a broad field of theory or devoted to new fields of existing theory 
(these functions are not mutually exclusive). It seems more difficult to obtain 
good papers of this kind than research papers. Now that statisticians are leav- 
ing war work the prospect for obtaining such papers should improve. The 
committee has been informed that the Editor has invited certain writers to 
contribute expository papers on assigned topics and it is recommended that this 
policy be continued. It is believed that the members of the Institute would
like to be informed in the Editor's report concerning progress in receiving such
papers. 

Last year this committee considered the possibility that the Institute sponsor 
the publication of a series of books and monographs. In view of recent developments
in the commercial publishing field it seems that there is ample opportunity
for the publication of such works as the Institute might otherwise undertake to
publish, and the committee therefore recommends against such Institute action
at this time. 


IV. Meetings

Under normal conditions of tiansportation, the Institute has held at least two 
meetings each year, one with the mathematical societies in the summer and one 
with the social science societies in the winter. This committee favors the con-
tinuation of this system. Occasionally, meetings have been held with an en-
gineering society. This program does not provide specifically for joint meetings
with societies devoted to (a) standardization, (b) engineering, or (c) natural 
sciences. Arrangements for meetings under (a) and (b) could be made through
our representatives on the Joint Committee for the Development of Statistical 
Applications in Engineering and Manufacturing, which has representation from 
each of these groups. This committee recommends that the Program Committee 
have on its membership one of the Institute’s representatives on the Joint Com- 
mittee and one who is active in the natural sciences. Important duties of these 
members are to give advice on the type of program desired for joint meetings
in these applied fields and to make arrangements for the meetings. It is also 
recommended that the Program Committee include Institute members who are 
active in the mathematical societies and in the social science societies so that our 
participation in meetings with these groups will be integral to their programs.
Other members of the Program Committee may be chosen with similar aims in 
mind. The yearly report of the Program Committee should discuss among
other matters the progress made in arranging joint meetings. 

V. Rietz Lecture

To direct attention to the work of the Institute, it is recommended that the 
Institute sponsor an annual lecture of broad interest, to be named after its first 
president, the late Professor Henry L. Rietz. It is suggested that the lecturer
be appointed by the Board of Directors, that he be given a year’s notice, and that 
the lecture be arranged for a meeting with an appropriate society.

VI. Chapters 

In establishing chapters, the Institute has undertaken obligations that to date 
have not been fulfilled. Two courses are open. Either the Institute should 
abolish its existing chapters or it should formulate a policy that will provide for a 
vigorous chapter program. Some requirements for chapters have been set down 
by the Committee on Policy with Regard to Local Chapters (Appendix D). 
It is proposed that this be submitted to the secretaries of our chapters for their 
comments. Further, certain broader aspects of the problem require additional 
consideration. Discussion with various members of the Institute indicates 
that some believe that the interests of the Institute because of its relatively 
small membership might be better served by organizing geographical sections 
rather than chapters. Pending final agreement on these points, this committee 
recommends that the Board of Directors hold in abeyance any requests for the
formation of new chapters. 

VII. Educational Standards 

The matter of educational standards for college courses is now in the hands 
of the Committee on the Teaching of Statistics. Such a committee should be a
permanent committee of the Institute. 

It is our further recommendation that one member of this committee be one 
of the representatives of the Institute on the Joint Committee for the Develop- 
ment of Statistical Applications in Engineering and Manufacturing. It should 
be his duty to assess needs for statistics courses, particularly in relation to stand- 
ardization and engineering. 

VIII. Relationships with Other Technical Societies 

In 1929, the Joint Committee for the Development of Statistical Applications 
in Engineering and Manufacturing was formed. The Institute has had two
representatives since 1937. The other sponsor societies for the Joint Com- 
mittee are: 

American Society of Mechanical Engineers 
American Society for Testing Materials 
American Statistical Association 
American Mathematical Society 
American Institute of Electrical Engineers 

Much of the use of statistical method in the war effort is traceable directly to the 
activity of this committee. In particular, this committee is working continu- 
ously to see that statistical methods and statistical concepts are introduced in 
connection with work on standardization, engineering, and the natural and social 
sciences. In a report published in the December 1940 issue of the Annals, the 
Institute’s War Preparedness Committee made the following recommendations:

The Institute should “cooperate to the fullest in matters pertaining to quality control 
and specification with the ‘Joint Committee for the Development of Statistical Applica-
tions in Engineering and Manufacturing,’ of which the Institute is a sponsor.”

Six specific steps for a cooperative program with the Joint Committee were 
outlined. However, although this report was accepted by the Board, no action 
was taken on these recommendations. In view of the above, we make the 
following recommendations to the Board: 

1. That the Institute’s representatives be requested to make a report on the activities
of the Joint Committee. (This should be the first of a series of yearly reports.)

2. That the Board request a report from the Joint Committee on the status of statistics
and statisticians in engineering and manufacturing including forecasts of future needs
and opportunities.

3. That the Board request a report from the Joint Committee on the status of statistics 
in the training of engineers including recommendations for such training in the future. 

4. That at least one of the Institute’s representatives be from the engineering or manu- 
facturing field. 
IX. Growth of the Institute

The Committee on Development has examined the record of growth of the 
Institute and finds that the largest increase in recent years has been among 
people from industry, a group that is still less than a quarter of the total mem- 
bership. It is believed that the program outlined above will stimulate growth 
in membership among all users and potential users of mathematical statistics.

X. Publicizing Mathematical Statistics 

This Committee recommends that the Institute make available to appropriate 
channels of public information reliable communications concerning mathematical 
statistics. As a specific recommendation, the case for the science of statistics 
should be presented at the hearings of the National Research (Science) Founda- 
tion Acts pending in Congress, preferably by representatives acting jointly for 
the Institute and the American Statistical Association. 

XI. The Intersociety Committee

A second meeting of the Intersociety Committee mentioned in last year’s 
report is to be held on December 8th. This Committee feels that consideration 
of proposals for reorganization of the Institute should not be undertaken prior 
to advice concerning the action of that Committee. 

W. G. Cochran, Chairman            P. S. Olmstead, Acting Chairman
C. I. Bliss                        C. C. Craig
F. C. Mosteller                    H. Scheffé

November 5, 1945 


Appendix B 

Report from the Committee on the Teaching of Statistics 

A preliminary draft of recommendations in the teaching of statistics was read 
by the chairman of this committee at the Rutgers Meeting of the Institute in
September 1945. These recommendations are at present being re-drafted by 
members of the Committee and it is hoped that they would be ready to present 
to the Board in the near future for possible publication in the Annals. 

Assistance was rendered during the first part of the year to the National Roster 
of Scientific and Specialized Personnel, toward the development of a formal 
description of the profession of statistics (mentioned in Part I of the Annual 
Report of the President). This assistance was carried out jointly with Dr.
Chester I. Bliss who was appointed by the American Statistical Association to
assist with this project. It is believed by this Committee that the description 
put forth by the Roster will help bring about recognition of standards of pro- 
fessional competence in statistics and in the teaching of statistics. 

Harold Hotelling, Chairman 
Walter Bartky 
Milton Friedman 
W. Edwards Deming 
Appendix C


Report from the Committee on Finance

The Committee on Finance met in the office of Dr. C. F. Roos in New York
City on September 14, 1945. Present were Messrs. Roos, C. H. Fischer, P. S. 
Dwyer; absent, A. C. Olshen. 

The Treasurer presented a summary of income and expenses during the third 
quarter of 1945 through September 13. This information was considered along 
with the first half year reports which were prepared some months ago. The 
Treasurer also presented a graph showing balance on hand at the end of each 
month (1939-1945) and one showing income during each month (1939-1945). 
These facts, as well as other pertinent information, were used in formulating the 
recommendations which follow. 

The Finance Committee proposes to the Board of Directors that the following
recommendations be approved by the Board as policy for the Institute of Mathe- 
matical Statistics. 

1. That no revision be made with reference to the adoption of the expected 
budget for 1945. It appears now that the income will be somewhat higher than 
the amount indicated on the expected budget ($6450) and that the amount of 
expense should be somewhat lower than the amount there estimated ($0050).

2. That the Secretary-Treasurer be instructed to prepare an Annual statement 
for 1945 on the general plan of previous annual statements with the addition of 
an analysis of assets and liabilities. The main assets are cash, bonds, and back 
issues of the Annals. It is recommended that the back issues be valued at 75
cents per copy (for inventory purpose)— a fair estimate of cost. It is further 
recommended that no value be placed on exchanges and office equipment. 

3. That the Secretary-Treasurer prepare the annual statement prior to the 
winter meeting, which means presumably that the books will be closed about 
December 10th. 


4. That, in consideration of the nature of the graph of the income of the 
Institute, the Institute adopt the policy of having its yearly report run from 
July 1 to July 1 and that the Secretary-Treasurer be instructed to draw up an 
additional annual report as of June 30, 1946. 

5. That the Secretary-Treasurer be instructed to draw up a budget for 1946
and to submit it to the Finance Committee in sufficient time so that action may 
be taken on it by the Board at its winter meeting. 

6. That the U. S. Government G Bonds now owned by the Institute ($3000)
be listed on the books at their face values even though the market values of those
bonds are slightly lower.

7. That the total amounts of all life membership payments be placed in a
[illegible] and that these funds, at least twice a year, be used
to purchase U. S. Government F Bonds. The market value of these bonds
[illegible] the amount of this [illegible] at any accounting period.

8. That the Secretary-Treasurer be authorized to take whatever steps are
necessary to obtain adequate interest on our liquid assets. That he maintain
sufficient cash position to carry on the business transactions of the Institute and 
that he invest the remainder (a) either in U. S. Government G bonds or (b) in 
short term bonds. 

9. That the purchase from Professor Carver of all back issues jointly owned by 
Professor Carver and the Institute be made an item of the budget for 1946. 

10. That the Secretary-Treasurer be instructed to purchase a $2,000 fidelity 
Bond Form B (a form which covers negligence as well as dishonesty) for 3 years 
for the office of Secretary-Treasurer. 

11. That a policy be adopted of allowing a straight 10% discount to all agencies 
and booksellers who send us subscriptions or orders for back issues. 

12. That the Institute set up a permanent Committee on Finance with the
Secretary-Treasurer as ex-officio member and chairman. There shall be three 
additional members with terms of three years with a new member each year. 
At the formation of the Committee one member shall be appointed for one year, 
one for two years, and one for three years. A resignation from the Committee 
shall be followed by an appointment for the unexpired term. 

13. That the Board notify any committee working on revision of the Constitu- 
tion and By-Laws that it is supporting a permanent committee on Finance and 
believes it appropriate that a statement of the organization and duties of this
committee should appear in the By-Laws. 

Paul S. Dwyer, Chairman 
Carl H. Fischer 
Abraham C. Olshen 
Charles F. Roos 

September 15, 1945 


Appendix D 

Report from the Committee on Policy with Regard to Local Chapters 

Attached to this report is a summary of provisions for organizing and working 
with local chapters; it might be cast into appropriate form and incorporated into 
the Constitution of the Institute. From these recommended provisions it will
be clear that this committee does not favor the organization of weak inactive
chapters. Unless the membership of the Institute grows substantially it will
be possible to have only a very limited number of local chapters under these
provisions.

It is the opinion of the Committee that it is desirable for members of the In- 
stitute to amalgamate with members of other statistical organizations in the same 
area to form local statistical societies. We believe this will build stronger local 
statistical organizations and will effect greater advances in the application and 
development of effective statistical methods. Such amalgamation in the formu- 
lation of local societies can best be stimulated, and national leadership provided, 
after the national statistical organizations have accomplished a federation or
amalgamation. We therefore urge the Institute to use its influence in stimulat- 
ing discussion and action concerning national federation or amalgamation. 

The following further comments are made in addition to or supplementing 
those provisions recommended for incorporation into the Constitution of the 
Institute: 

1. Do not accept or reject the petition from any group until a plan of organiza- 
tion is formulated. There should be clearance on the following questions: 

a. What are the reciprocal responsibilities of chapters and the parent 
organization? What type of chapter activity should the Institute 
seek to promote? What kind of things can chapters do that will 
advance the purposes for which the Institute exists? 

We have indicated in the recommended provisions that the Presi-
dent of the Institute should personally undertake or designate someone
to work with the chapters in answering these and similar questions.
b. If local chapters are not active will they hinder the efforts of the parent
organization? We believe that the existence of an inactive organiza- 
tion is a detriment to development of an active statistical group in a 
community. Activity can be measured in various ways: 

a. Meetings for research in mathematical statistics

b. Joint meetings with other professions 

c. Bringing in new members to the parent organization 

d. Annual election of officers

2. If members of a chapter must be members of the parent organization, the
Secretary-Treasurer of the Institute should notify the secretary of a local
chapter whenever a new member joins within his area. 

3. It is recommended that if a local chapter desires it, bills for Institute dues 
contain provision for collection of local dues. 

4. The Institute should not allow any local group to use its name unless the 
group contributes to the accomplishment of the aims of the Institute. 

Morris H. Hansen, Chairman 
Gertrude Cox 
Samuel S. Wilks 


Suggested Article on Local Chapters for addition to the Constitution 

1. Local chapters of the Institute of Mathematical Statistics may be organ-
ized [illegible] a local organization of members
who are resident within a given limited territory.

2. The members of the local chapter shall be members of the Institute.

3. A local chapter may be established upon acceptance by the Board of Direc-
tors of the [illegible].

4. [illegible] officers, committees, [illegible].

5. The affairs of local chapters shall be in general charge of the President of the
Institute or a representative assigned by him to be responsible for local chapters,
under the direction of the Board of Directors.

6. Any local chapter will be dissolved by: 

(a) failing for two successive years to maintain a paid membership of at 
least 25 members or to hold at least one meeting per year which shall include 
election of officers; or 

(b) by vote of the Board of Directors of the Institute.

7. Each local chapter shall transmit a report to the Secretary-Treasurer of the 
Institute within 30 days of the annual business meeting, reporting among other 
things, on its officers, the number of members, and on the meetings held during 
the year. 


Appendix E 

Report from the Committee on Meetings 

A meeting was held at Rutgers University on Sunday Sept. 16, which was 
attended by 115 members of the Institute. Simultaneously a meeting was held 
by the American Mathematical Society. The first session, which commenced 
at 10 a.m., was a symposium on sequential analysis. The chairman was Professor
W. Allen Wallis of Stanford University and Director of the Statistical Research 
Group at Columbia University. The speakers and their titles are listed below.

1. Theory of sequential analysis.

Professor A. Wald, Columbia University 

2. Construction of multiple sampling inspection plans for attributes from sequential prin-
ciples.

Dr. Milton Friedman, National Bureau of Economic Research and the Statistical 
Research Group 

3. Applications of sequential analysis to the ranking of two populations with respect to a
single parameter. 

Mr. Meyer A. Girshick, Bureau of Agricultural Economics and the Statistical Re- 
search Group 

The afternoon session was a series of contributed papers, followed by a pre- 
liminary report from the Institute’s Committee on the Teaching of Statistics, 
which was delivered by Professor Harold Hotelling. Dr. W. Edwards Deming, 
President of the Institute, was chairman of this meeting. The list of contributed 
papers follows hereunder. 

1. On the variance of a random set in n dimensions. 

Dr. Herbert E. Robbins, The Post Graduate School, Annapolis 

2. The non-central Wishart distribution and its application to problems in multivariate
analysis. 

Dr. T. W. Anderson, Jr., Princeton University

3. The effect on a distribution function of small changes in the population function.
Professor Burton H. Camp, Wesleyan University 

4. On composite distributions. 

Dr. Casper Goffman and Dr. Benjamin Epstein, Westinghouse Electric and Manu-
facturing Company 
5. Population, expected values, and sample

Professor Emil J. Gumbel, New School for Social Research

6. On the selection of a sample in repeated steps.

Dr. William G. Madow, Bureau of the Census

7. On optimum estimates for stratified samples. (Presented by Margaret Gurney, Bureau
of the Census)

Mr. Morris H. Hansen and Mr. William N. Hurwitz, Bureau of the Census

8. Pearsonian correlation coefficients associated with least squares theory. (Presented by
title)

Professor Paul S. Dwyer, University of Michigan

At this writing preparations are being made for a meeting to be held in Cleve-
land, January 24-27, 1946, and for a meeting with the A.A.A.S. to be held in 
St. Louis, March 27-30. 

John H. Curtiss, Chairman 
T. Koopmans 
William G. Madow 


Appendix F 

Report from the Committee on Membership 

The Committee, after study and consideration, recommended to the Board of 
Directors that Messrs. M. S. Bartlett, T. Haavelmo, William N. Hurwitz, and 
John von Neumann be advanced to the grade of Fellow. This recommendation 
was approved by the Board. 

The Committee, with the advice and approval of the Board is preparing a 
letter to be sent to groups of people who are not members of the Institute to call 
their attention to the work of the Institute. This letter will be accompanied by 
reprints of a recent paper by Wald and Wolfowitz on Sampling inspection-plans 
for continuous production, with a brief explanation of the field covered by the 
Wald-Wolfowitz paper, and the statement that it and others that have appeared 
in recent issues of the Annals have already modified statistical practice in im-
portant ways. 

Joseph L. Doob, Chairman 
Paul S. Dwyer 
T. Koopmans 
Will Feller 


Appendix G 

Report from the Committee for Increasing Subscriptions to Libraries and 

Laboratories 

This committee prepared suitable literature to send to prospective subscribers. 
This literature contained a concise description of the nature of the Annals, a 
table of contents for a year, and a subscription blank. 
Alphabetical lists of public, college, university and industrial libraries were
prepared. These lists contained the name, the librarian, and the address of each
library. They were checked for duplicates for present subscribers and sent to
Professor Dwyer, Secretary-Treasurer. Altogether, the list contained about 
1500 libraries. 

Professor Dwyer took care of printing the literature, further checking for 
duplicates, addressing the envelopes, and mailing. 

William Dowell Baten, Chairman 
Harold F. Dodge 
Irving W. Burr
L. Aroian 



ANNUAL REPORT OF THE SECRETARY-TREASURER OF THE 

INSTITUTE 


(For 1945) 

Accounts of the Rutgers meeting of the Institute appeared in the September 
issue of the Annals. Notices of meetings of the Washington Chapter have been
sent out from the office of the Secretary-Treasurer. 

Due to a large extent to activity of the members, the Institute has enjoyed a 
large increase in membership during the year. The 606 members of a year ago
have increased to 777. This is an increase of over 28%. 
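
The percentage quoted above can be checked with a line or two of arithmetic. The following is only an illustrative sketch in Python, taking the two membership counts as 606 and 777; the variable names are hypothetical and form no part of the report.

```python
from fractions import Fraction

# Membership counts taken from the Secretary-Treasurer's report.
members_last_year = 606
members_this_year = 777

# Absolute and relative growth over the year.
increase = members_this_year - members_last_year
relative_increase = Fraction(increase, members_last_year)
percent = float(relative_increase * 100)

print(f"{increase} new members, an increase of {percent:.1f}%")
```

The result lies between 28% and 29%, in agreement with the statement that the increase is over 28%.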

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in looking after the back issues of the Annals which
are stored at Iowa City. 

The following financial statement is drawn up along lines specified by the 
Finance Committee and the Board of Directors. It covers the period December 
31, 1944 to December 31, 1945. 


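FINANCIAL STATEMENT
December 31, 1944, to December 31, 1945

A. Receipts

Balance on Hand, December 31, 1944 ....................   $6,790.65
Dues ..................................................    4,108.40
Life Membership Payments ..............................      885.00
Subscriptions .........................................    1,513.73
Sale of Back Numbers ..................................    1,737.46
Income from Investments ...............................       75.00
Miscellaneous .........................................        2.20

Total .................................................  $15,112.44

B. Expenditures

Annals — Current
  Office of Editor ........................     $400.00
  Waverly Press ...........................    4,066.42
                                                          $4,466.42
Annals — Back Numbers
  Purchase from H. C. Carver ..............      280.61
  Reprinted 300 copies
    Vol. I No. 2, Vol. II No. 1, Vol. IX No. 1   727.60
  Iowa City Office ........................       46.00
                                                           1,063.01
Office of President ...................................      130.26
Mathematical Reviews ..................................      100.00
Office of the Secretary-Treasurer
  Printing, Mimeographing, programs, etc.
    (including stamped envelopes) .........      764.68
  Postage and supplies ....................      166.25
  Clerical help ...........................      852.95
                                                           1,773.78
Miscellaneous .........................................       30.75
Balance on Hand, December 31, 1945 (Cash and Bonds) ...    7,548.22

                                                         $15,112.44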

C. Summary of Receipts and Expenditures

Balance on Hand,* December 31, 1944 ...................   $6,790.65
Receipts during 1945 ..................................    8,321.79
Expenditures during 1945 ..............................    7,564.22
Balance on Hand,* December 31, 1945 ...................    7,548.22
Net Excess of Receipts over Expenditures, 1945 ........      757.57
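
The summary is a simple balance identity: the opening balance plus receipts, less expenditures, gives the closing balance, and the net excess is receipts less expenditures. The following is a minimal sketch in Python that checks the identity, taking the figures of the summary as $6,790.65, $8,321.79 and $7,564.22; the variable names are hypothetical and form no part of the statement.

```python
from decimal import Decimal

# Figures from the Summary of Receipts and Expenditures (dollars).
opening_balance = Decimal("6790.65")   # balance on hand, December 31, 1944
receipts = Decimal("8321.79")          # receipts during 1945
expenditures = Decimal("7564.22")      # expenditures during 1945

# Net excess of receipts over expenditures, and the resulting closing balance.
net_excess = receipts - expenditures
closing_balance = opening_balance + net_excess

print("Net excess:     ", net_excess)
print("Closing balance:", closing_balance)
```

Decimal arithmetic is used so that the cents are carried exactly, which binary floating point would not guarantee.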

D. Comparison of Assets on December 31, 1944 and December 31, 1945

                                                    1944           1945

U. S. Government G Bonds ...................   $3,000.00      $6,000.00
Life Membership Funds ......................      330.00 Bank    888.00 F Bonds
                                                                 327.00 Bank Dep.
Additional Bank Deposits ...................    3,460.65         333.22
Current Accounts Receivable ................      303.73         255.35
Estimated Value (Cost)** of back issues of Annals
  At Iowa City .............................    4,210.26       3,825.75
  At Ann Arbor .............................      567.00       1,242.80
Deduct Estimated Value of issues owned by
  H. C. Carver .............................      879.60         570.00

Total ......................................  $11,001.03     $12,301.62

Net Gain 1945 ..............................                   1,300.49


E. Liabilities of Institute of Mathematical Statistics as of December 31, 1945

All bills which have been presented have been paid and there are no outstanding ac-
counts against the Institute of appreciable size. The $1215 in Life Membership payments
require the Institute to provide the privileges of membership for life for the 17 members
who have made payments. About $2600 should be credited to 1946 dues and subscriptions.

PAUL S. DWYER 

Secretary-Treasurer.

December 31, 1945 

* In form of bank deposit and government bonds. 

** Value of Annals calculated at 75 cents per copy. All 1944 figures and 1945 Ann Arbor 
figures based on physical inventory. 1945 Iowa City figures based on book inventory. 



ANNUAL REPORT OF THE EDITOR 

(For 1945)

In spite of the war, enough papers in mathematical statistics have been 
proposed for publication in the Annals in 1945 to keep the total volume of ma- 
terial at approximately 450 pages, the level which has been maintained during 
the last few years. A total of 40 papers were published in the 1945 volume of the
Annals of which 14 were short notes published in the “notes” section. The
outlook for a sufficient number of acceptable papers to maintain the usual volume
of publication during 1946 looks quite favorable. Many mathematical statis-
ticians who were engaged in war work are now free to resume their research.
In some cases statistical theory developed in connection with classified war 
research projects can be expected to be declassified in the near future and made 
available for open publication. 

Most of the material which has been published in the Annals consists of original
research or extensions of work already published in mathematical statistics as 
contrasted with material of an expository character. In view of the considerable 
number of newcomers into the Institute, as well as a general increase of interest
in probability and statistics during recent years, it would be highly desirable to 
publish more expository or survey material. Invitations have been accepted by 
several individuals to prepare expository articles, but they have been so heavily
burdened with extra work during the war that they have been unable to complete 
their tasks. It is hoped that circumstances will now permit the preparation of 
expository articles. 

On behalf of the Editorial Committee for the Annals, the Editor takes this
opportunity to acknowledge with thanks the refereeing assistance which has
been received from the following individuals during 1945: R. L. Anderson, T. W.
Anderson, George W. Brown, A. H. Copeland, W. J. Dixon, J. L. Doob, Milton
Friedman, M. A. Girshick, M. Kac, T. Koopmans, Carl Kossack, D. H. Lehmer,
H. B. Mann, P. J. McCarthy, F. C. Mosteller, H. E. Robbins, J. W. Tukey,
W. A. Wallis, J. D. Williams, and C. P. Winsor. The Editor is also indebted to
the following individuals at Princeton University for preparation of manu-
scripts for the printer, and other editorial assistance from time to time in con-
nection with the Annals: Mrs. Gladys B. Huling, Luis F. Nanni, Mrs. Euthie
Ross, Mrs. Eleanor C. Schoenly, and John E. Walsh.


December 31, 1945 


S. S, Wilks 
Editor 
CONSTITUTION

OF THE 

INSTITUTE OF MATHEMATICAL STATISTICS 

ARTICLE I 
Name and Purpose

1. This organization shall be known as the Institute of Mathematical Statistics. 

2. Its object shall be to promote the interests of mathematical statistics. 

ARTICLE II 

Membership 

1. The membership of the Institute shall consist of Members, Fellows, Honorary 
Members, and Sustaining Members. 

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior
members excepted, who have been members for twenty-three months prior to the date 
of voting. 

3. No person shall be a Junior Member of the Institute for more than a limited term as
determined by the Committee on Membership and approved by the Board of Directors. 

ARTICLE III 

Officers, Board of Directors, and Committee on Membership 

1. The Officers of the Institute shall be a President, two Vice-Presidents, and a Secre-
tary-Treasurer. The terms of office of the President and Vice-Presidents shall be one
year and that of the Secretary-Treasurer three years. Elections shall be by majority 
ballots at Annual Meetings of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority vote of the 
individuals present at the organization meeting, and shall serve until December 31, 1936. 

2. The Board of Directors of the Institute shall consist of the Officers, the two previous 
Presidents, and the Editor of the Official Journal of the Institute. 

3. The Institute shall have a Committee on Membership composed of a Chairman and
three Fellows. At their first meeting subsequent to the adoption of this Constitution, the
Board of Directors shall elect three members as Fellows to serve as the Committee on 
Membership, one member of the Committee for a term of one year, another for a term of 
two years, and another for a term of three years. Thereafter the Board of Directors shall 
elect from among the Fellows one member annually at their first meeting after their elec-
tion for a term of three years. The President shall designate one of the Vice-Presidents as
Chairman of this Committee. 


ARTICLE IV 
Meetings 

1. A meeting for the presentation and discussion of papers, for the election of Officers, 
and for the transaction of other business of the Institute shall be held annually at such
time as the Board of Directors may designate. Additional meetings may be called from
time to time by the Board of Directors and shall be called at any time by the President 
upon written request from ten Fellows. Notice of the time and place of meeting shall be 
given to the membership by the Secretary-Treasurer at least thirty days prior to the 
date set for the meeting. All meetings except executive sessions shall be open to the
public. Only papers accepted by a Program Committee appointed by the President may 
be presented to the Institute. 

2. The Board of Directors shall hold a meeting immediately after their election and 
again immediately before the expiration of their term. Other meetings of the Board 
may be held from time to time at the call of the President or any two members of the 
Board. Notice of each meeting of the Board, other than the two regular meetings,
together with a statement of the business to be brought before the meeting, must be
given to the members of the Board by the Secretary-Treasurer at least five days prior to 
the date set therefor. Should other business be passed upon, any member of the Board 
shall have the right to reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call 
of the Chairman or any member of the Committee provided notice of such call and the 
purpose of the meeting is given to the members of the Committee by the Secretary- 
Treasurer at least five days before the date set therefor. Should other business be passed 
upon, any member of the Committee shall have the right to reopen the question at the 
next meeting. Committee business may also be transacted by correspondence if that 
seems preferable. 

4. At a regularly convened meeting of the Board of Directors, four members shall 
constitute a quorum. At a regularly convened meeting of the Committee on Member- 
ship, two members shall constitute a quorum.


ARTICLE V 

Publications 

1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute. 
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the 
Board of Directors of the Institute. The term of office of the Editor may be terminated 
at the discretion of the Board of Directors. 

2. Other publications may be originated by the Board of Directors as occasion arises. 


ARTICLE VI 

Expulsion or Suspension 

1. Except for non-payment of dues, no one shall be expelled or suspended except by 
action of the Board of Directors with not more than one negative vote. 

ARTICLE VII 
Amendments 

1. This Constitution may be amended by an affirmative two-thirds vote at any regularly 
convened meeting [...] of such proposed amendment [...] the Secretary-Treasurer at least 
thirty [...] 
ARTICLE I 

Duties of the Officers, the Editor, Board of Directors, and 
Committee on Membership 

1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the 
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present, 
shall preside at the meetings of the Institute and of the Board of Directors. At meetings 
of the Institute, the presiding officer shall vote only in the case of a tie, but at meetings 
of the Board of Directors he may vote in all cases. At least three months before the date 
of the annual meeting, the President shall appoint a Nominating Committee of three 
members. It shall be the duty of the Nominating Committee to make nominations for 
Officers to be elected at the annual meeting and the Secretary-Treasurer shall notify all 
voting members at least thirty days before the annual meeting. Additional nominations 
may be submitted in writing, if signed by at least ten Fellows of the Institute, up to 
the time of the meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings 
at the meetings of the Institute and of the Board of Directors, send out calls for said 
meetings and, with the approval of the President and the Board, carry on the corre- 
spondence of the Institute. Subject to the direction of the Board, he shall have charge 
of the archives and other tangible and intangible property of the Institute and once a year 
he shall publish in the Annals of Mathematical Statistics a classified list of all Members and 
Fellows of the Institute. He shall send out calls for annual dues and acknowledge receipt 
of same; pay all bills approved by the President for expenditures authorized by the Board 
or the Institute; keep a detailed account of all receipts and expenditures, prepare a finan- 
cial statement at the end of each year and present an abstract of the same at the annual 
meeting of the Institute after it has been audited by a Member or Fellow of the Institute 
appointed by the President as Auditor. The Auditor shall report to the President. 

3. Subject to the direction of the Board, the Editor shall be charged with the responsibility 
for all editorial matters concerning the editing of the Annals of Mathematical Statistics. 
He shall, with the advice and consent of the Board, appoint an Editorial Committee 
of not less than twelve members to co-operate with him; four for a period of five years, 
four for a period of three years, and the remaining members for a period of two years, 
appointments to be made annually as needed. All appointments to the Editorial Committee 
shall terminate with the appointment of a new Editor. The Editor shall serve as 
editorial adviser in the publication of all scientific monographs and pamphlets authorized 
by the Board. 

4. The Board of Directors shall have charge of the funds and of the affairs of the 
Institute, with the exception of those affairs specifically assigned to the President or to 
the Committee on Membership. The Board shall have authority to fill all vacancies 
ad interim, occurring among the Officers, Board of Directors, or in any of the Committees. 
The Board may appoint such other committees as may be required from time to time 
to carry on the affairs of the Institute. The power of election to the different grades of 
Membership, except the grades of Member and Junior Member, shall reside in the Board. 

5. The Committee on Membership shall prepare and make available through the 
Secretary-Treasurer an announcement indicating the qualifications requisite for the 
different grades of membership. The Committee shall review these qualifications periodically 
and shall make such changes in these qualifications and make such recommendations 
with reference to the number of grades of membership as it deems advisable. The power 
to elect worthy applicants to the grades of Member and Junior Member shall reside in the 
Committee, which may delegate this power to the Secretary-Treasurer, subject to such 
reservations as the Committee considers appropriate. The Committee shall make recom- 
mendations to the Board of Directors with reference to placing members in other grades 
of membership. The Committee shall give its attention to the question of increasing the 
number of applicants for membership and shall advise the Secretary-Treasurer on plans 
for that purpose. 


ARTICLE II 
Dues 

1. Members shall pay five dollars at the time of admission to membership and shall 
receive the full current volume of the Official Journal. Thereafter, Members shall pay 
five dollars annual dues. The annual dues of Junior Members shall be two dollars and 
fifty cents. 

The annual dues of Fellows shall be five dollars. The annual dues of Sustaining 
Members shall be fifty dollars. Honorary Members shall be exempt from all dues. 

(a) Exception. In the case that two Members of the Institute are husband and wife 
and they elect to receive between them only one copy of the Official Journal, the annual 
dues of each shall be three dollars and seventy-five cents. 

(b) Exception. Any Member or Fellow may make a single payment which will be 
accepted by the Institute in place of all succeeding yearly dues and which will not otherwise 
alter his status as a Member or Fellow. The amount of this payment will depend 
upon the age of this Member or Fellow and will be based upon a suitable table and rate of 
interest, to be specified by the Board of Directors. 

(c) Exception. Any Member or Junior Member of the Institute serving, except as a 
commissioned officer, in the Armed Forces of the United States or of one of its allies, may 
upon notification to the Secretary-Treasurer be excused from the payment of dues until the 
January first following his discharge from the Service. He shall have all privileges of 
membership except that he shall not receive the Official Journal. However, during the 
first year of his resumed regular membership he may have the right to purchase, at $2.60 
per volume, one copy of each volume of the Official Journal published during the period 
of his service membership. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to 
the Official Journal. The annual dues of a Sustaining Member include two subscriptions 
to the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues 
may be six months in arrears, and to accompany such notice by a copy of this Article. 
If such person fail to pay such dues within three months from the date of mailing such 
notice, the Secretary-Treasurer shall report the delinquent one to the Board of Directors, 
by whom the person's name may be stricken from the rolls and all privileges of membership 
withdrawn. Such person may, however, be re-instated by the Board of Directors 
upon payment of the arrears of dues. 


BY-LAWS 




ARTICLE III 
Salaries 

1. The Institute shall not pay a salary to any Officer, Director, or member of any 
committee. 


ARTICLE IV 
Amendments 

1. These By-Laws may be amended in the same manner as the Constitution or by a 
majority vote at any regularly convened meeting of the Institute, if the proposed amend- 
ment has been previously approved by the Board of Directors. 




CONTRIBUTIONS TO THE THEORY OF SEQUENTIAL ANALYSIS. I 


By M. A. Girshick 
United States Department of Agriculture 


PART I Applications of Sequential Analysis to the Ranking of Two 
Populations With Respect to a Single Parameter. 


1. Summary. Given two populations $\pi_1$ and $\pi_2$, each characterized by a distribution 
density $f(x, \theta)$ which is assumed to be known except for the value of 
the parameter $\theta$, it is desired to test the composite hypothesis $\theta_1 < \theta_2$ against 
the alternative hypothesis $\theta_1 > \theta_2$, where $\theta_i$ is the value of the parameter in the 
distribution density of $\pi_i$, $(i = 1, 2)$. 

The criterion proposed for testing this hypothesis is based on the sequential 
probability ratio and consists of the following: 

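Choose two positive constants $a$ and $b$ and two values of $\theta$, say $\theta_1^0$ and $\theta_2^0$. 

Take pairs of observations $x_{1\alpha}$ from $\pi_1$ and $x_{2\alpha}$ from $\pi_2$, $(\alpha = 1, 2, \ldots)$, in sequence 
and compute $Z_j = \sum_{\alpha=1}^{j} z_\alpha$ where 

$$z_\alpha = \log \frac{f(x_{2\alpha}, \theta_1^0)\, f(x_{1\alpha}, \theta_2^0)}{f(x_{2\alpha}, \theta_2^0)\, f(x_{1\alpha}, \theta_1^0)}.$$ 
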

The hypothesis tested is accepted or rejected depending on whether $Z_n \ge a$ or 
$Z_n \le -b$, where $n$ is the smallest integer $j$ for which either one of these relationships 
is satisfied. 

The boundaries $a$ and $b$ are partly given in terms of the desired risks of making 
an erroneous decision. The values $\theta_1^0$ and $\theta_2^0$ define the magnitude of the difference 
between the values of $\theta$ in $\pi_1$ and in $\pi_2$ which is considered worth detecting. 
It is shown that the power of this test is constant on a curve $h(\theta_1, \theta_2) = \text{constant}$. 

If $E\left(\log \dfrac{f(x, \theta_2^0)}{f(x, \theta_1^0)}\right)$ is a monotonic function of $\theta$, then the test is unbiased in the 
sense that all points $(\theta_1, \theta_2)$ which lie on the curve $h(\theta_1, \theta_2) = \text{constant}$ are such 
that either every $\theta_1 < \theta_2$ or every $\theta_1 > \theta_2$. For a large class of known distributions 
the quantity $h$ is shown to be an appropriate measure of the difference 
between $\theta_1$ and $\theta_2$, and the test procedure for this class of distributions is simple 
and intuitively sensible. 

For the case of the binomial, the exact power of this test as well as the distribu- 
tion of n is given. 


1.1 General discussion. Consider two processes (populations) $\pi_1$ and $\pi_2$ 
each yielding a measurable quantity $x$ whose distribution density $f(x, \theta)$ is assumed 
to be known except for the value of the parameter $\theta$. On the basis of a 
random sample obtained from each, it is desired to choose that process which 
yields the smaller (or larger) $\theta$. That is, it is desired to devise a test which will 
result in a high probability of accepting $\pi_1$ if the $\theta$ characterizing its distribution 
density is smaller (or larger) than the $\theta$ in $\pi_2$, a high probability of rejecting $\pi_1$ 
(i.e. accepting $\pi_2$) when the opposite is true, and approximately equal probability 
of making one or the other decision if the value of $\theta$ in $\pi_1$ is the same as in $\pi_2$. 

As an illustration of the type of problem here considered, let us assume that a 
manufacturer is faced with a choice between two competing processes of pro- 
duction, each process yielding an unknown fraction defective p and each entail- 
ing about the same operating cost. Based on the evidence of a random sample 
selected from each, the manufacturer wishes to choose that process which yields 
the smaller fraction defective. If the fractions defective in the two processes 
differ by a significant amount, he will want a test which guarantees a high prob- 
ability of making a correct decision. If, however, the fraction defective in the 
two processes are of approximately the same magnitude, it will be a matter of 
indifference to him which decision is reached. 

The solution given in this paper to the above problem is based on Wald’s 
sequential probability ratio test [1]. The resulting procedure not only requires 
on the average, fewer observations for the same protection than any other test 
(which is always the case with sequential tests of this type) but is also direct and 
simple when applied to a large class of distributions commonly met in practice. 

1.2 Derivation of the sequential test when the existence of a priori probabilities 
is assumed. The choice of the probability ratio as a method of discriminating 
between the two processes is suggested by considerations of a priori 
probabilities. Let us assume that each process may have either $\theta_1^0$ or $\theta_2^0$ as the 
value of a parameter $\theta$ in its distribution density and that the value $\theta_1^0$ is more 
desirable than $\theta_2^0$. Let us further assume that there exists an a priori probability 
$g_1$ that a process will have $\theta_1^0$ as a parameter and an a priori probability $g_2 = 1 - g_1$ 
that it will have $\theta_2^0$ as a parameter. Let the likelihood for $n$ observations 
$x_{11}, x_{12}, \ldots, x_{1n}$ drawn from $\pi_1$ be designated by $p(x_{11}, x_{12}, \ldots, x_{1n}, \theta_1^0)$ when 
$\theta_1^0$ is the parameter in $\pi_1$, and by $p(x_{11}, x_{12}, \ldots, x_{1n}, \theta_2^0)$ when $\theta_2^0$ is the parameter 
in $\pi_1$. Let the likelihoods $p(x_{21}, x_{22}, \ldots, x_{2n}, \theta_1^0)$ and $p(x_{21}, x_{22}, \ldots, x_{2n}, \theta_2^0)$ 
be similarly defined for $n$ observations $x_{21}, x_{22}, \ldots, x_{2n}$ drawn from $\pi_2$. Then 

(1.201) $$p(x_{i1}, x_{i2}, \ldots, x_{in}, \theta_j^0) = \prod_{\alpha=1}^{n} f(x_{i\alpha}, \theta_j^0), \qquad i, j = 1, 2.$$ 

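Let $\beta_{ij}$, $(i, j = 1, 2)$, be the a posteriori probability, having obtained $x_{i\alpha}$, 
$(\alpha = 1, 2, \ldots, n)$, that process $\pi_i$ has $\theta_j^0$ as a parameter in its distribution density. 
Then 

(1.202) $$\beta_{i1} = \frac{g_1\, p(x_{i1}, \ldots, x_{in}, \theta_1^0)}{g_1\, p(x_{i1}, \ldots, x_{in}, \theta_1^0) + g_2\, p(x_{i1}, \ldots, x_{in}, \theta_2^0)}$$ 

and 

(1.203) $$\beta_{i2} = \frac{g_2\, p(x_{i1}, \ldots, x_{in}, \theta_2^0)}{g_1\, p(x_{i1}, \ldots, x_{in}, \theta_1^0) + g_2\, p(x_{i1}, \ldots, x_{in}, \theta_2^0)}$$ 

for $i = 1, 2$. 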
In order to decide whether the hypothesis that $\theta_1^0$ belongs to the distribution 
density of $\pi_1$ is more tenable than the hypothesis that it belongs to the distribution 
of $\pi_2$, it is only necessary to compare $\beta_{11}$ with $\beta_{21}$. But if $\beta_{11}$ is equal to or 
greater than $\beta_{21}$, the ratio $\beta_{11}/\beta_{12}$ must be equal to or greater than $\beta_{21}/\beta_{22}$, and 
conversely. For assume that $\beta_{11} \ge \beta_{21}$. Subtracting $\beta_{11}\beta_{21}$ from each side of the 
inequality we get $\beta_{11}(1 - \beta_{21}) \ge \beta_{21}(1 - \beta_{11})$. But since $1 - \beta_{21} = \beta_{22}$ and $1 - \beta_{11} = \beta_{12}$, 
we see that $\beta_{11}/\beta_{12} \ge \beta_{21}/\beta_{22}$. Conversely, let $\beta_{11}/\beta_{12} \ge \beta_{21}/\beta_{22}$. Then 
$\beta_{11}(1 - \beta_{21}) \ge \beta_{21}(1 - \beta_{11})$, or $\beta_{11} \ge \beta_{21}$. 

From the above it would appear that a sensible sequential procedure for deciding 
whether $\theta_1^0$ is more likely to belong to $\pi_1$ than to $\pi_2$ is as follows: Select two 
positive quantities $A$ and $B$ with $A > 1$ and $B < 1$. Take a pair of observations 
$(x_{1\alpha}, x_{2\alpha})$, $(\alpha = 1, 2, \ldots)$, at a time, one from each process. At each step (i.e., 
for each sample size $n$) compute the ratio $\lambda = \dfrac{\beta_{21}}{\beta_{22}} \Big/ \dfrac{\beta_{11}}{\beta_{12}}$. If at any stage $\lambda \le B$, 
terminate the sampling and accept the hypothesis that $\theta_1^0$ is a parameter in the 
distribution density of $\pi_1$. On the other hand, if at any stage $\lambda \ge A$, terminate 
sampling and accept the hypothesis that $\theta_1^0$ is a parameter of the distribution 
density in $\pi_2$. If neither holds, that is if $B < \lambda < A$, then take another pair of 
observations, consisting of one from each process. Continue this procedure 
until one or the other decision is reached.$^1$ 

The interesting point here is that the decision function $\lambda$ is independent of $g_1$ 
and $g_2$. In fact, it is easily seen from equations (1.202) and (1.203) that 

(1.204) $$\lambda = \frac{p(x_{21}, x_{22}, \ldots, x_{2n}, \theta_1^0)\, p(x_{11}, x_{12}, \ldots, x_{1n}, \theta_2^0)}{p(x_{21}, x_{22}, \ldots, x_{2n}, \theta_2^0)\, p(x_{11}, x_{12}, \ldots, x_{1n}, \theta_1^0)}.$$ 
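
The cancellation of the prior in (1.204) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the Bernoulli density used for illustration are assumptions introduced here, and `g1` plays the role of the a priori probability $g_1$.

```python
import math

def likelihood(xs, theta, f):
    """Likelihood (1.201): product of f(x, theta) over one sample."""
    return math.prod(f(x, theta) for x in xs)

def posterior_pair(xs, theta1_0, theta2_0, g1, f):
    """Posteriors (1.202) and (1.203) for the process that produced xs."""
    w1 = g1 * likelihood(xs, theta1_0, f)
    w2 = (1.0 - g1) * likelihood(xs, theta2_0, f)
    return w1 / (w1 + w2), w2 / (w1 + w2)

def lambda_from_posteriors(x1s, x2s, theta1_0, theta2_0, g1, f):
    """lambda = (beta_21 / beta_22) / (beta_11 / beta_12)."""
    b11, b12 = posterior_pair(x1s, theta1_0, theta2_0, g1, f)
    b21, b22 = posterior_pair(x2s, theta1_0, theta2_0, g1, f)
    return (b21 / b22) / (b11 / b12)

def lambda_from_likelihoods(x1s, x2s, theta1_0, theta2_0, f):
    """Ratio (1.204); the prior g1 does not appear."""
    num = likelihood(x2s, theta1_0, f) * likelihood(x1s, theta2_0, f)
    den = likelihood(x2s, theta2_0, f) * likelihood(x1s, theta1_0, f)
    return num / den

def bernoulli(x, p):
    """Illustrative density (an assumption): probability of outcome x in {0, 1}."""
    return p if x == 1 else 1.0 - p
```

For any fixed pair of samples, `lambda_from_posteriors` should return the same value for every admissible `g1`, and that value should agree with `lambda_from_likelihoods`.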

1.3 The proposed sequential test as a special case of a sequential probability 
ratio test. If we examine the expression given in (1.204) we see that it is a ratio 
of two likelihoods. The numerator of the ratio is the likelihood of the $2n$ observations 
under the hypothesis that $\theta_2^0$ is a parameter in $\pi_1$ and $\theta_1^0$ is a parameter 
in $\pi_2$; the denominator is the likelihood of the $2n$ observations under the hypothesis 
that $\theta_1^0$ is a parameter in $\pi_1$ and $\theta_2^0$ is a parameter in $\pi_2$. Thus, the proposed 
sequential test is equivalent to a sequential probability ratio test (see [1]) 
for testing the simple hypothesis that $\theta_1^0$ belongs to $\pi_1$ and $\theta_2^0$ belongs to $\pi_2$ against 
the alternative hypothesis that $\theta_2^0$ belongs to $\pi_1$ and $\theta_1^0$ belongs to $\pi_2$. We can, 
therefore, apply the theory of sequential analysis developed by A. Wald ([1] and 
[2]) to this problem. 

While the test is posed in terms of a simple hypothesis, the solution, as will be 
shown later, is in fact a solution to a composite hypothesis. In order to bring 
this out more clearly we shall rederive a few of the results which have already 
been obtained by A. Wald. This will be done in sections 1.4, 1.5, and 1.6. 

$^1$ That a decision will be reached eventually can be asserted with probability one if the 
variance of the variate $z_\alpha$ (defined by (1.301) below) is different from zero (or if it is zero, 
the value of $z_\alpha$ is different from zero). See [2], Lemma 1. As we shall see later, if, in fact, 
both processes have either $\theta_1^0$ or $\theta_2^0$ as parameters, then the above sequential procedure will 
result in the acceptance of either process with approximately equal probability. 
In what follows we shall speak of the hypothesis $(\theta_1, \theta_2)$ to mean the hypothesis 
that $\theta_1$ is the value of the parameter in the distribution density of $\pi_1$ and $\theta_2$ is the 
value of the parameter in the distribution density of $\pi_2$. The hypothesis $(\theta_1^0, \theta_2^0)$ 
will represent a specific hypothesis which we may wish to test and will be 
used to define the decision function (the probability ratio) of the sequential test. 

Let us fix A > 1 and B < 1 and set 


(1.301) $$z_\alpha = \log \frac{f(x_{2\alpha}, \theta_1^0)\, f(x_{1\alpha}, \theta_2^0)}{f(x_{2\alpha}, \theta_2^0)\, f(x_{1\alpha}, \theta_1^0)}$$ 

where $x_{1\alpha}$ is the $\alpha$th observation from $\pi_1$, $x_{2\alpha}$ is the $\alpha$th observation from $\pi_2$ 
and $(\theta_1^0, \theta_2^0)$ is the particular hypothesis to be tested against the alternative hypothesis 
$(\theta_2^0, \theta_1^0)$. Let $a = \log A$ and $-b = \log B$. Then $a$ and $b$ are positive. 
Since the observations from $\pi_1$ and $\pi_2$ are assumed to be independent, 
$\log \lambda = \sum_{\alpha=1}^{n} z_\alpha$. Hence the proposed sequential test can be carried out in the following 
manner. Draw one pair of observations at a time, one from $\pi_1$ and one from $\pi_2$. 
Let $z_1, z_2, \ldots$ be the values of $z_\alpha$ obtained from the first, second, etc. trial. 
Let $Z_n = z_1 + z_2 + \cdots + z_n$, $(n = 1, 2, \ldots)$. Continue sampling as long as 
$-b < Z_n < a$. Whenever $Z_n \ge a$, $(n = 1, 2, 3, \ldots)$, terminate sampling and 
accept $\pi_2$ (or $\pi_1$). Whenever $Z_n \le -b$, $(n = 1, 2, 3, \ldots)$, terminate sampling 
and accept $\pi_1$ (or $\pi_2$). 
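
The sampling rule just described can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the author's procedure verbatim: the function names, the boundary labels `"upper"` and `"lower"`, and the Bernoulli density are introduced here for the example; which process is accepted at each boundary is left to the reader, as in the text.

```python
import math

def log_ratio(x1, x2, theta1_0, theta2_0, f):
    """z for one pair (x1 from pi_1, x2 from pi_2), following (1.301)."""
    num = f(x2, theta1_0) * f(x1, theta2_0)
    den = f(x2, theta2_0) * f(x1, theta1_0)
    return math.log(num / den)

def sequential_test(pairs, theta1_0, theta2_0, f, a, b):
    """Accumulate Z_n over pairs until Z_n >= a or Z_n <= -b.

    Returns (boundary, n, Z_n), where boundary is "upper", "lower",
    or "undecided" if the supplied pairs run out first.
    """
    Z = 0.0
    n = 0
    for x1, x2 in pairs:
        n += 1
        Z += log_ratio(x1, x2, theta1_0, theta2_0, f)
        if Z >= a:
            return "upper", n, Z
        if Z <= -b:
            return "lower", n, Z
    return "undecided", n, Z

def bernoulli(x, p):
    """Illustrative density (an assumption): probability of outcome x in {0, 1}."""
    return p if x == 1 else 1.0 - p
```

With a Bernoulli density, a pair in which both observations agree contributes $z = 0$, so only discordant pairs move $Z_n$ toward a boundary.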

1.3a Basic assumptions. In this section and throughout this paper, we shall 
be dealing with sequential tests involving, as above, a decision function 
$Z_n = z_1 + z_2 + \cdots + z_n$, $(n = 1, 2, \ldots, \text{ad inf.})$, where the $z_\alpha$'s are independently 
distributed random variables having a common distribution function. Let $z$ 
denote a random variable whose distribution is the same as the common distribution 
of $z_\alpha$, $(\alpha = 1, 2, \ldots, \text{ad inf.})$. It will be assumed, even if not explicitly 
stated, that the distribution of $z$ satisfies the following conditions. 

Condition i. Both the expected value $Ez$ of $z$ and the variance of $z$ exist and are 
unequal to zero. 

Condition ii. There exists a positive $\delta$ such that $P(e^z > 1 + \delta) > 0$ and 
$P(e^z < 1 - \delta) > 0$. 

Condition iii. For any real value $h$, the expected value $Ee^{hz} = g(h)$ exists. 

Condition iv. The first two derivatives of the function $g(h)$ exist and may be 
obtained by differentiating under the integral sign. 

1.3b Fundamental properties of sequential tests. Let $z$ be defined as in 1.3a. 
Then under the assumption that the distribution of $z$ satisfies the conditions 
specified, Wald [2] has proved the following: 

Lemma I. The probability that a decision is reached in a finite number of steps is 
unity. 

Lemma II. There exists one and only one real value $h \ne 0$ such that the expected 
value $Ee^{hz} = 1$. 

Fundamental identity: The fundamental identity $E\, e^{Z_n t}[\phi(t)]^{-n} = 1$ holds 
for all points $t$ in the complex plane for which $|\phi(t)| \ge 1$, where $\phi(t) = Ee^{zt}$. 
Let $w = \log \dfrac{f(x, \theta_2^0)}{f(x, \theta_1^0)}$ and let the distribution density of $x$ be $f(x, \theta)$. Let $\theta_1$ and 
$\theta_2$ be any two values of $\theta$ which may be distinct from $\theta_1^0$ and $\theta_2^0$. Then it can 
easily be verified that if $w$ satisfies the conditions specified in section 1.3a under 
the hypothesis $\theta = \theta_1$ as well as the hypothesis $\theta = \theta_2$, and if moreover the expected 
values of $w$ under these two hypotheses are not equal, then 
$z = \log \dfrac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}$ will also satisfy these conditions when the joint distribution 
density of $x_1$ and $x_2$ ($x_1$ representing the measurable characteristic in $\pi_1$ and $x_2$ 
in $\pi_2$) is either $f(x_1, \theta_1) f(x_2, \theta_2)$ or $f(x_1, \theta_2) f(x_2, \theta_1)$. 

In what follows, we shall assume that the distribution of $w$ satisfies the required 
restrictions for the $\theta_1$ and $\theta_2$ under consideration and that the expectation 
of $w$ under the hypothesis $\theta = \theta_1$ is unequal to the expectation of $w$ under the 
hypothesis $\theta = \theta_2$. Consequently, we shall assume that Lemmas I and II and 
the Fundamental Identity hold for all the sequential tests we shall consider. 


1.4 The power of the proposed test. Let $x_1$ be an observation from $\pi_1$ and 
$x_2$ an observation from $\pi_2$. Let 

(1.401) $$z = \log \frac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}$$ 

where $\theta_1^0$ and $\theta_2^0$ are specified parameters in the probability density of $\pi_1$ and $\pi_2$ 
respectively. Furthermore, let $\phi(t \mid \theta_1, \theta_2) = E(e^{tz} \mid \theta_1, \theta_2)$ be the moment generating 
function of $z$ under the hypothesis $(\theta_1, \theta_2)$. Then 


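(1.402) $$E(e^{tz} \mid \theta_1, \theta_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[\frac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}\right]^{t} f(x_1, \theta_1)\, f(x_2, \theta_2)\, dx_1\, dx_2.$$ 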


By Lemma II there exists one and only one real number $h \ne 0$ such 
that $E(e^{hz} \mid \theta_1, \theta_2) = 1$. Let $L_h = P(Z_n \le -b \mid \theta_1, \theta_2)$ be the probability that 
the sequential test terminates with $Z_n \le -b$ under the hypothesis $(\theta_1, \theta_2)$. Then 
by Lemma I, $1 - L_h = P(Z_n \ge a \mid \theta_1, \theta_2)$. For any random variable $u$ considered 
under the hypothesis $(\theta_1, \theta_2)$, let the symbol $E_b(u)$ stand for the expected 
value of $u$ under the restriction that $Z_n \le -b$ and $E_a(u)$ stand for the expected 
value of $u$ under the restriction that $Z_n \ge a$. In terms of the above definitions, 
the Fundamental Identity can be expressed as follows: 

(1.403) $$L_h E_b\{e^{tZ_n}[\phi(t \mid \theta_1, \theta_2)]^{-n}\} + (1 - L_h) E_a\{e^{tZ_n}[\phi(t \mid \theta_1, \theta_2)]^{-n}\} = 1.$$ 

Setting $t = h$ in (1.403) we get 

(1.404) $$L_h E_b e^{hZ_n} + (1 - L_h) E_a e^{hZ_n} = 1.$$ 

Following Wald [2], we define a two valued random variable $\bar{Z}_n$ in this manner: 
$\bar{Z}_n = a$ if $Z_n \ge a$ and $\bar{Z}_n = -b$ if $Z_n \le -b$. Let $Z_n - \bar{Z}_n = \epsilon$. Then $\epsilon$ is also a 
random variable. In what follows, we shall substitute 0 for $\epsilon$. The error committed 
in neglecting $\epsilon$ is small when $\theta_1^0$ is close to $\theta_2^0$. As we shall indicate later, 
the quantity $\epsilon$ can, in fact, be neglected without error in the special case where 
$f(x, \theta)$ is the binomial distribution. 

Substituting $\bar{Z}_n$ for $Z_n$ in (1.404) we get 

(1.405) $$L_h e^{-hb} + (1 - L_h) e^{ha} = 1.$$ 

Solving for $L_h$ we get$^2$ 

(1.406) $$L_h = \frac{e^{h(a+b)} - e^{hb}}{e^{h(a+b)} - 1}.$$ 

As we shall see later, $h = 0$ when $\theta_1 = \theta_2$. But when $h = 0$, $L_h$ in (1.406) is 
indeterminate. However, it can be easily seen that 

(1.407) $$\lim_{h \to 0} L_h = \frac{a}{a + b}.$$ 

It follows from (1.406) that the power of the test is constant for all $\theta_1$ and $\theta_2$ 
which give the same root $t = h$. The quantity $h$ is thus fundamental in this 
test, and as we shall see later, is an appropriate measure of the difference between 
$\theta_1$ and $\theta_2$ for a large class of distributions. 
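
Formulas (1.406) and (1.407) are easy to evaluate numerically. The sketch below is an illustration only; the function name is an assumption, and the expression is rearranged as $(e^{ha} - 1)/(e^{ha} - e^{-hb})$, which equals (1.406) after multiplying numerator and denominator by $e^{hb}$ and behaves better for small $h$.

```python
import math

def prob_lower_boundary(h, a, b):
    """Approximate L_h = P(Z_n <= -b) from (1.406), neglecting the excess
    over the boundaries; returns the limit a / (a + b) of (1.407) at h = 0."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    if abs(h) < 1e-12:
        return a / (a + b)
    up = math.expm1(h * a)     # e^{ha} - 1
    down = math.expm1(-h * b)  # e^{-hb} - 1
    return up / (up - down)    # (e^{ha} - 1) / (e^{ha} - e^{-hb})
```

Evaluating the function over a grid of $h$ values gives the approximate operating characteristic of the test as a function of $h$ alone.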

1.5 Method of determining the sequential test. Let $z$ be defined as in 
(1.401) and let $\phi_1(t) = E(e^{tz} \mid \theta_1^0, \theta_2^0)$ be the moment generating function of $z$ 
under the hypothesis $(\theta_1^0, \theta_2^0)$, and let $\phi_2(t) = E(e^{tz} \mid \theta_2^0, \theta_1^0)$ be the moment generating 
function of $z$ under the hypothesis $(\theta_2^0, \theta_1^0)$. Furthermore, let 
$\alpha = P(\bar{Z}_n = a \mid \theta_1^0, \theta_2^0)$ and $\beta = P(\bar{Z}_n = -b \mid \theta_2^0, \theta_1^0)$. Then by Lemma I, 
$1 - \alpha = P(\bar{Z}_n = -b \mid \theta_1^0, \theta_2^0)$ and $1 - \beta = P(\bar{Z}_n = a \mid \theta_2^0, \theta_1^0)$. Now, applying Wald's 
Fundamental Identity we have, 

(1.501) $$(1 - \alpha) e^{-bt} E_{1b}[\phi_1(t)]^{-n} + \alpha e^{at} E_{1a}[\phi_1(t)]^{-n} = 1,$$ 

(1.502) $$\beta e^{-bt} E_{2b}[\phi_2(t)]^{-n} + (1 - \beta) e^{at} E_{2a}[\phi_2(t)]^{-n} = 1,$$ 

where the symbol $E_{1a}$ stands for the conditional expectation knowing that $\bar{Z}_n = a$ 
and $E_{1b}$ stands for the conditional expectation knowing that $\bar{Z}_n = -b$; with both 
expectations taken under the hypothesis $(\theta_1^0, \theta_2^0)$. The symbols $E_{2a}$ and $E_{2b}$ 
are similarly defined but under the hypothesis $(\theta_2^0, \theta_1^0)$. Setting $t = 1$ in (1.501) 
and $t = -1$ in (1.502), we get, in view of Corollary 2, Theorem 2 below, 

(1.503) $$(1 - \alpha) e^{-b} + \alpha e^{a} = 1,$$ 

(1.504) $$\beta e^{b} + (1 - \beta) e^{-a} = 1.$$ 


$^2$ In what follows, $L_h$ will always stand for the probability that a sequential test will 
terminate with $Z_n \le -b$. In any given problem, the interpretation of the event $Z_n \le -b$ 
will be clear from the context. 
Now $a = \log A$ and $-b = \log B$. Hence, equations (1.503) and (1.504) become 

(1.505) $$(1 - \alpha) B + \alpha A = 1,$$ 

(1.506) $$\frac{\beta}{B} + \frac{1 - \beta}{A} = 1,$$ 

or 

(1.507) $$A = \frac{1 - \beta}{\alpha} \quad \text{and} \quad a = \log \frac{1 - \beta}{\alpha},$$ 

(1.508) $$B = \frac{\beta}{1 - \alpha} \quad \text{and} \quad b = \log \frac{1 - \alpha}{\beta}.$$ 

From (1.507) and (1.508) we see that the sequential test is completely determined 
by the function $z$, which, in turn, is defined by $\theta_1^0$ and $\theta_2^0$, and by the probabilities 
of making a decision for the two hypotheses $(\theta_1^0, \theta_2^0)$ and $(\theta_2^0, \theta_1^0)$. 
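
As a small numerical companion to (1.507) and (1.508), the boundaries can be computed directly from the two risks. This is a sketch only; the function name and the requirement $\alpha + \beta < 1$ (which guarantees $A > 1$ and $B < 1$) are stated here as assumptions of the example.

```python
import math

def boundaries_from_risks(alpha, beta):
    """Return (a, b) from (1.507)-(1.508):
    a = log((1 - beta) / alpha), b = log((1 - alpha) / beta)."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0) or alpha + beta >= 1.0:
        raise ValueError("need 0 < alpha, beta and alpha + beta < 1")
    A = (1.0 - beta) / alpha
    B = beta / (1.0 - alpha)
    return math.log(A), -math.log(B)
```

When the two risks are equal the function returns $a = b$, which is the symmetric case in which the two processes are accepted with equal frequency when $\theta_1 = \theta_2$.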

Once $z$ is defined in terms of a specific $(\theta_1^0, \theta_2^0)$, the probability that $Z_n \le -b$ 
will be equal to $1 - \alpha$ and the probability that $Z_n \ge a$ will be $\alpha$ (if we neglect 
the fact that $|Z_n|$, at a decision point, might exceed $a$ or $b$) for the totality of 
hypotheses $(\theta_1, \theta_2)$ for which the moment generating function $\phi(t \mid \theta_1, \theta_2) = 1$ 
when $t = 1$. A similar statement can be made for the corresponding hypotheses 
$(\theta_2, \theta_1)$ for which the moment generating function will equal unity when $t = -1$. 
Hence, we see that while the test is defined by specifying two points $(\theta_1^0, \theta_2^0)$ 
and $(\theta_2^0, \theta_1^0)$ in the parameter space, the pre-assigned risks $\alpha$ and $\beta$ of making 
an incorrect decision will be approximately constant on the set of points for 
which the moment generating function equals unity when $t = 1$ and when 
$t = -1$, respectively. This set of points usually will constitute a smooth curve. 

If $\theta_1 = \theta_2$, $L_0 = \dfrac{a}{a + b}$ (by 1.407). Hence, the probability of accepting $\pi_1$ 
will be close to $\tfrac{1}{2}$ if $a$ is close to $b$, and will equal $\tfrac{1}{2}$ if $a = b$. But from (1.507) 
and (1.508) we see that $a = b$ if $\alpha = \beta$. Thus, if we construct a test which 
will give a probability of rejecting $\pi_1$ when $(\theta_1^0, \theta_2^0)$ is true equal to the probability 
of accepting $\pi_1$ when $(\theta_2^0, \theta_1^0)$ is true, we shall be accepting $\pi_1$ and $\pi_2$ with equal 
frequency when in fact $\theta_1 = \theta_2$. 

1.6 The average number of pairs of observations required to reach a decision. 

Let $E(n \mid \theta_1, \theta_2)$ be the expected number of pairs of observations required to reach 
a decision under the hypothesis $(\theta_1, \theta_2)$. We shall show that 

(1.601) $$E(n \mid \theta_1, \theta_2) = \frac{a(1 - L_h) - bL_h}{Ez}.$$ 

Proof: Differentiating the Fundamental Identity, 

(1.602) $$E\, e^{tZ_n}[\phi(t)]^{-n} = 1,$$ 

with respect to $t$, we get$^3$ 

(1.603) $$E\left\{\left[Z_n - n\frac{\phi'(t)}{\phi(t)}\right] e^{tZ_n}[\phi(t)]^{-n}\right\} = 0.$$ 

Setting $t = 0$, we get 

(1.604) $$EZ_n - \phi'(0) E(n \mid \theta_1, \theta_2) = 0.$$ 

But 

(1.605) $$E\bar{Z}_n = a(1 - L_h) - bL_h$$ 

and 

(1.606) $$\phi'(0) = Ez.$$ 

Hence, solving for $E(n \mid \theta_1, \theta_2)$ in (1.604) and substituting from (1.605) and 
(1.606) we get 

(1.607) $$E(n \mid \theta_1, \theta_2) = \frac{a(1 - L_h) - bL_h}{Ez}.$$ 

While $L_h$ is approximately constant for all values of $(\theta_1, \theta_2)$ for which the moment 
generating function equals unity for $t = h$, the expected value of $n$ given by 
(1.607) will depend on the particular hypothesis $(\theta_1, \theta_2)$. This follows from the 
fact that $Ez$ is not necessarily constant for the same set of points $(\theta_1, \theta_2)$ for which 
$L_h$ is constant. 
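
Formula (1.607) can be evaluated once $L_h$ and $Ez$ are known. The sketch below is illustrative; the function name is an assumption, and both the excess over the boundaries and the case $Ez = 0$ are outside its scope.

```python
def expected_pairs(a, b, L_h, mean_z):
    """Approximate E(n) from (1.607): (a (1 - L_h) - b L_h) / Ez."""
    if mean_z == 0:
        raise ValueError("(1.607) requires Ez != 0")
    return (a * (1.0 - L_h) - b * L_h) / mean_z
```

As an illustration with assumed numbers, take a binomial case with $\theta_1^0 = 0.1$, $\theta_2^0 = 0.2$ and $\alpha = \beta = 0.05$, so that $a = b = \log 19$. Under the hypothesis $(\theta_1^0, \theta_2^0)$ one has $h = 1$, $L_h = 0.95$ and $Ez = -0.1 \log 2.25$, and the formula gives roughly 33 pairs.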


1.7 Some general properties of the proposed test. 

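Theorem 1. Let $z = \log \dfrac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}$, where $x_1$ is an observation from $\pi_1$ 
and $x_2$ from $\pi_2$. Then if $F(z)$ is the distribution density of $z$ under the hypothesis 
$(\theta_1, \theta_2)$, $F(-z)$ is the distribution density of $z$ under the hypothesis $(\theta_2, \theta_1)$. 

Proof: Let $t$ be a real number and let $\phi_1(t) = E(e^{itz} \mid \theta_1, \theta_2)$ be the characteristic 
function of $z$ under the hypothesis $(\theta_1, \theta_2)$. Then 

(1.701) $$\phi_1(t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[\frac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}\right]^{it} f(x_1, \theta_1)\, f(x_2, \theta_2)\, dx_1\, dx_2.$$ 

Now let $\phi_2(t) = E(e^{-itz} \mid \theta_2, \theta_1)$ be the characteristic function of $-z$ under the 
hypothesis $(\theta_2, \theta_1)$. Then 

(1.702) $$\phi_2(t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left[\frac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}\right]^{-it} f(x_1, \theta_2)\, f(x_2, \theta_1)\, dx_1\, dx_2.$$ 

Interchanging the variables of integration in (1.702) we see that $\phi_1(t) = \phi_2(t)$. 
Consequently, the distribution of $z$ under the hypothesis $(\theta_1, \theta_2)$ is the same as 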

$^3$ [...] the Fundamental Identity can be differentiated with respect to $t$. 
See Wald [1], page [...], [...] without any reference to the Fundamental Identity. 
the distribution of $-z$ under the hypothesis $(\theta_2, \theta_1)$. This theorem in conjunction 
with the fact that $E(z \mid \theta_1, \theta_2) \ne 0$ when $\theta_1 \ne \theta_2$ shows that the decision 
function $z$ discriminates in a real sense between the two alternative hypotheses 
$(\theta_1, \theta_2)$ and $(\theta_2, \theta_1)$. 

Theorem 2. Let $E(e^{tz} \mid \theta_1, \theta_2)$ be the moment generating function of $z$ under the 
hypothesis $(\theta_1, \theta_2)$ and let $E(e^{tz} \mid \theta_2, \theta_1)$ be the moment generating function of $z$ 
under the hypothesis $(\theta_2, \theta_1)$. Then, if $t = h$ is a root of the equation 
$E(e^{tz} \mid \theta_1, \theta_2) = 1$, then $t = -h$ is a root of the equation $E(e^{tz} \mid \theta_2, \theta_1) = 1$. 

Proof: The same as Theorem 1. As we have seen in Section 1.4, the power 
of the proposed sequential test (neglecting $\epsilon$) depends only on $h$. This theorem 
shows that if the probability of accepting $\pi_1$ is large under the hypothesis 
$(\theta_1, \theta_2)$, it will be small under the hypothesis $(\theta_2, \theta_1)$, and conversely. 

Corollary 1. The only value of $t$ for which $E(e^{tz} \mid \theta, \theta) = 1$ is $t = 0$. This 
follows from Theorem 2. 

Corollary 2. The values of $t$ for which $E(e^{tz} \mid \theta_1^0, \theta_2^0) = 1$ and $E(e^{tz} \mid \theta_2^0, \theta_1^0) = 1$ 
are $t = 1$ and $t = -1$ respectively. This can be seen by expressing 
$E(e^{tz} \mid \theta_1^0, \theta_2^0)$ as a double integral and setting $t = 1$. 

Theorem 3. Let $\omega$ be the totality of points $(\theta_1, \theta_2)$ in the parameter space for 
which $\theta_1 < \theta_2$. Then a necessary and sufficient condition that the values of $h$ (for 
which $E(e^{hz} \mid \theta_1, \theta_2) = 1$) be of the same sign for all points in $\omega$ is that 

(1.703) $$Ew \mid \theta = \int_{-\infty}^{\infty} \log \frac{f(x, \theta_2^0)}{f(x, \theta_1^0)}\, f(x, \theta)\, dx$$ 

be a monotonic function of $\theta$. 

To prove this theorem we need the following lemma. 

Lemma 1. Let $g(x, \theta)$ be the distribution density of $x$ and $\psi(t)$ its moment generating 
function. Let $h$ be the real non-zero value of $t$ for which $\psi(t) = 1$. Then 
the sign of $h$ is opposite to the sign of $Ex$ (the expected value of $x$) if $Ex \ne 0$. 

Proof: For any random variable $u$, Wald [1] has shown that the inequality 

(1.704) $$Eu < \log Ee^{u}$$ 

holds. 

Setting $u = tx$, where $t$ is a constant, we get 

(1.705) $$tEx < \log Ee^{tx} = \log \psi(t).$$ 

Setting $t = h$ in (1.705) we get $hEx < 0$. This proves the lemma. 
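
Lemma 1 can be illustrated numerically. The sketch below locates the non-zero root of $\log \psi(t) = 0$ by bisection on the side of zero opposite to the sign of the mean, relying on the convexity of $\log \psi$. The function name, the starting bracket, and the normal example are assumptions of this illustration: for a normal variable with mean $\mu$ and variance $\sigma^2$, $\log \psi(t) = \mu t + \sigma^2 t^2 / 2$, whose non-zero root is $-2\mu/\sigma^2$.

```python
def nonzero_root(log_mgf, mean, tol=1e-10):
    """Find h != 0 with log_mgf(h) = 0, searching where sign(h) = -sign(mean).

    log_mgf must be the logarithm of a moment generating function, so that
    it is convex, vanishes at 0, and has slope `mean` there.
    """
    if mean == 0:
        raise ValueError("the non-zero root requires a non-zero mean")
    sign = -1.0 if mean > 0 else 1.0
    lo = 1e-3
    while log_mgf(sign * lo) >= 0.0:   # shrink until inside the negative dip
        lo /= 2.0
    hi = 2.0 * lo
    while log_mgf(sign * hi) < 0.0:    # expand until past the root
        hi *= 2.0
    while hi - lo > tol:               # bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if log_mgf(sign * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)
```

In each case the product of the returned root and the mean is negative, in agreement with $hEx < 0$.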

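Now let $E(z \mid \theta_1, \theta_2)$ be the expected value of $z$ under the hypothesis $(\theta_1, \theta_2)$ 
where $(\theta_1, \theta_2)$ belongs to $\omega$. Then 

(1.706) $$E(z \mid \theta_1, \theta_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \log \frac{f(x_2, \theta_1^0)\, f(x_1, \theta_2^0)}{f(x_2, \theta_2^0)\, f(x_1, \theta_1^0)}\, f(x_1, \theta_1)\, f(x_2, \theta_2)\, dx_1\, dx_2$$ 
$$= \int_{-\infty}^{\infty} \log \frac{f(x, \theta_2^0)}{f(x, \theta_1^0)}\, f(x, \theta_1)\, dx - \int_{-\infty}^{\infty} \log \frac{f(x, \theta_2^0)}{f(x, \theta_1^0)}\, f(x, \theta_2)\, dx = Ew \mid \theta_1 - Ew \mid \theta_2.$$ 

From (1.706) we see that if $Ew \mid \theta$ is monotonic in $\theta$, $E(z \mid \theta_1, \theta_2)$ will have a constant 
sign for all points $(\theta_1, \theta_2)$ in $\omega$ and hence by Lemma 1, $h$ will have a constant 
sign. Conversely, if $h$ is of constant sign for all $(\theta_1, \theta_2)$ in $\omega$, so will $E(z \mid \theta_1, \theta_2)$ 
be. Consequently, by (1.706) $Ew \mid \theta$ must be monotonic. 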

Corollary 1. Let $Ew \mid \theta$ be a monotonic function of $\theta$ and let $\omega_h$, $(h \neq 0)$, be
the totality of points $(\theta_1, \theta_2)$ in the parameter space for which the power of the
sequential test is constant. Then the coordinates of the points $(\theta_1, \theta_2)$ in $\omega_h$ are such
that either every $\theta_1 < \theta_2$ or every $\theta_1 > \theta_2$.

Proof: By assumption all points in $\omega_h$ have the same power. Since $L_h$ in
(1.406) is a strictly increasing function of $h$, the points in $\omega_h$ must yield the same
$h$. However, if we assume that $\omega_h$ contains a point $(\theta_1', \theta_2')$ with $\theta_1' < \theta_2'$ and a
point $(\theta_1'', \theta_2'')$ with $\theta_1'' > \theta_2''$, the sign of $E(z \mid \theta_1', \theta_2')$ by (1.706) will be opposite
to the sign of $E(z \mid \theta_1'', \theta_2'')$. Hence, the value of $h$ yielded by $(\theta_1', \theta_2')$ is opposite
in sign to that yielded by $(\theta_1'', \theta_2'')$, which contradicts the assumption that both
points yield the same $h$.

Theorem 3 and Corollary 1 show that if $Ew \mid \theta$ is monotonic in $\theta$, the proposed
sequential test is unbiased in the sense that all points $(\theta_1, \theta_2)$ that lie on the curve
$h$ = constant (and hence have the same power) will have the property that
either the inequality $\theta_1 < \theta_2$ holds or the inequality $\theta_1 > \theta_2$ holds. The equality
sign will hold if and only if $h = 0$.


1.8 The proposed test applied to distributions which admit sufficient statistics.
Let $f(x, \theta)$ admit a sufficient estimate of $\theta$. Then it is well known that
$f(x, \theta)$ can be written in the form⁴

(1.801) $f(x, \theta) = e^{u(x)v(\theta) + r(x) + w(\theta)}$.

Setting $z = \log \frac{f(x_2, \theta_1^0) f(x_1, \theta_2^0)}{f(x_2, \theta_2^0) f(x_1, \theta_1^0)}$, we see that for this class of distributions the
decision function assumes the simple form:

(1.802) $z = [u(x_2) - u(x_1)][v(\theta_1^0) - v(\theta_2^0)]$.

Let $a^* = \frac{a}{v(\theta_1^0) - v(\theta_2^0)}$ and $b^* = \frac{b}{v(\theta_1^0) - v(\theta_2^0)}$. Then the decision function
becomes

(1.803) $z^* = u(x_2) - u(x_1)$.

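We shall now show that, for this class of distributions, the power of the sequential
test is a function of $v(\theta_1) - v(\theta_2)$. To prove this, it is only necessary to show
that $E(e^{tz^*} \mid \theta_1, \theta_2)$ equals unity for $t = v(\theta_1) - v(\theta_2)$. Now

(1.804) $E(e^{tz^*} \mid \theta_1, \theta_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{t[u(x_2) - u(x_1)]} f(x_1, \theta_1) f(x_2, \theta_2) \, dx_1 \, dx_2$
$= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \exp\{[v(\theta_1) - t]u(x_1) + [v(\theta_2) + t]u(x_2) + r(x_1) + r(x_2) + w(\theta_1) + w(\theta_2)\} \, dx_1 \, dx_2$.

If we set $t = v(\theta_1) - v(\theta_2)$ in (1.804), the integrand becomes $f(x_1, \theta_2) f(x_2, \theta_1)$,
whose double integral is unity, and we see that the statement is proved.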


4 See, for example, [3].

Let $En \mid h$ be the average number of pairs of observations required to reach a
decision when $v(\theta_1) - v(\theta_2) = h$. Then by formula (1.607) we have

(1.805) $E(n \mid h) = \frac{a^*(1 - L_h) - b^* L_h}{E[u(x_2) - u(x_1)]} = \frac{(1 - L_h) \log A + L_h \log B}{h_0 E[u(x_2) - u(x_1)]}$.


Since the expected value of $u(x)$ will not necessarily equal $v(\theta)$, the average number
of pairs of observations required to reach a decision will depend not only on
$v(\theta_1) - v(\theta_2)$ but also on the particular hypothesis $(\theta_1, \theta_2)$ considered.

Since the power of the test for this class of distributions depends on $v(\theta_1) - v(\theta_2)$,
it will be constant for all $\theta_1$ and $\theta_2$ which lie on the curve defined by
$v(\theta_1) - v(\theta_2)$ = constant. In particular, if the sequential test is defined with risks $\alpha$
and $\beta$, the probability of accepting $\pi_1$ (or $\pi_2$) will be approximately $\alpha$ for all
hypotheses $(\theta_1, \theta_2)$ which lie on the curve defined by
$v(\theta_1) - v(\theta_2) = v(\theta_1^0) - v(\theta_2^0) = h_0$, and the probability of accepting $\pi_2$ (or $\pi_1$)
will be approximately $\beta$ for all hypotheses $(\theta_2, \theta_1)$ which lie on the curve defined by
$v(\theta_2) - v(\theta_1) = h_0$. Now, the decision function $z$ as well as the boundaries $a^*$ and $b^*$
will be identical for all sequential tests provided they are defined by the same risks
$\alpha$ and $\beta$ and the parameters $\theta_1^0$ and $\theta_2^0$ which determine the decision function all
lie on the curve $v(\theta_1) - v(\theta_2) = h_0$. Since Wald [1] has proved that the sequential
probability ratio test minimizes $E(n)$, the expected number of observations required
to reach a decision, when the hypothesis tested is true as well as when the
alternative hypothesis is true, it must follow that in the case under consideration
$E(n)$ is minimized for all hypotheses $(\theta_1, \theta_2)$ which lie either on the curve
defined by $v(\theta_1) - v(\theta_2) = h_0$ or on the curve defined by $v(\theta_2) - v(\theta_1) = h_0$. If
$v(\theta)$ is a monotonic function of $\theta$, then the test is unbiased (i.e., all points $(\theta_1, \theta_2)$
which lie on the curve $v(\theta_1) - v(\theta_2)$ = constant will have the property that either
every $\theta_1 < \theta_2$ or every $\theta_1 > \theta_2$).

For this type of distribution, the importance of the difference between $\theta_1$ and
$\theta_2$ may be measured by $v(\theta_1) - v(\theta_2)$. We shall now show that the function
$v(\theta_1) - v(\theta_2)$ is an appropriate measure of the difference between these parameters
for a wide class of distributions which often occur in practice.


1.9 The proposed test applied to known distributions. 

1.9a. The problem of discriminating between means when the variances are known.
Let $f(x, \mu)$ be a normal distribution function with unknown mean $\mu$ and known
variance $\sigma^2$ which we shall assume, without loss of generality, to be unity. Let
$x_1$ be an observation from $\pi_1$ and $x_2$ an observation from $\pi_2$. Let the distribution
density of $x_1$ be designated by $f(x_1, \mu_1)$ and that of $x_2$ by $f(x_2, \mu_2)$. The problem
is to decide which process has the larger $\mu$.

Since $f(x, \mu)$ is a normal distribution, it is given by

(1.901) $f(x, \mu) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - \mu)^2}$.

Hence $f(x, \mu)$ is of the form considered in Section 1.8 with $u(x) = x$ and
$v(\mu) = \mu$. Therefore, the decision function is given by

(1.902) $z^* = x_2 - x_1$

and the power of the test depends on $h = \mu_1 - \mu_2$ and is given by (1.406) with
$a$ and $b$ replaced by $a^*$ and $b^*$, respectively.

The sequential test is performed in the following manner: We take a pair of
observations, one from $\pi_1$ and one from $\pi_2$, in sequence. If at any stage
$\sum_{\alpha=1}^{n} (x_{2\alpha} - x_{1\alpha}) < -b^*$, we accept the hypothesis that $\pi_1$ has the larger mean. If,
however, at any stage $\sum_{\alpha=1}^{n} (x_{2\alpha} - x_{1\alpha}) > a^*$, we accept the hypothesis that $\pi_2$ has
the larger mean. If neither holds, we continue sampling. According to Section
1.8, $a^* = \frac{\log A}{\mu_1^0 - \mu_2^0}$ and $-b^* = \frac{\log B}{\mu_1^0 - \mu_2^0}$, where $\mu_1^0 - \mu_2^0$ is assumed to be positive.

In order to determine a sequential test, we must fix $a^*$ and $b^*$. That is, we
must fix the quantities $\mu_1^0 - \mu_2^0$, $A$, and $B$. This can be accomplished by deciding:
(1) the smallest difference between the means of the two processes which
is considered worth detecting. This determines $h_0 = \mu_1^0 - \mu_2^0$, which we shall
assume to be positive; (2) the maximum probability $\alpha$ of rejecting the hypothesis
that $\pi_1$ has the larger mean when in fact $\mu_1$ in $\pi_1$ differs from $\mu_2$ in $\pi_2$ by as much
as $h_0$; and (3) the maximum probability $\beta$ of accepting the hypothesis that $\pi_1$
has the larger mean when in fact the difference between $\mu_1$ and $\mu_2$ is as large as
$h_0$ negatively.⁵ When $\alpha$ and $\beta$ are fixed, $A$ and $B$ are determined by equations
(1.507) and (1.508).
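The procedure of Section 1.9a can be sketched in a few lines of Python. Equations (1.507) and (1.508) are not reproduced in this section, so the sketch assumes Wald's usual approximations $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$; the function names and the returned labels are illustrative only.

```python
import math


def boundaries(alpha, beta, h0):
    """Return (a_star, b_star) for a positive mean difference h0.

    Assumption: A = (1 - beta) / alpha and B = beta / (1 - alpha),
    Wald's approximate thresholds, stand in for (1.507) and (1.508).
    """
    A = (1.0 - beta) / alpha
    B = beta / (1.0 - alpha)
    a_star = math.log(A) / h0
    b_star = -math.log(B) / h0   # so that -b_star = log(B) / h0
    return a_star, b_star


def sequential_mean_test(pairs, a_star, b_star):
    """Run the test on an iterable of (x1, x2) pairs.

    Returns (decision, number_of_pairs_used).
    """
    total = 0.0
    used = 0
    for x1, x2 in pairs:
        used += 1
        total += x2 - x1                      # z* = x2 - x1, cumulated
        if total < -b_star:
            return "pi_1 has the larger mean", used
        if total > a_star:
            return "pi_2 has the larger mean", used
    return "undecided", used


if __name__ == "__main__":
    a_star, b_star = boundaries(alpha=0.05, beta=0.05, h0=1.0)
    print(sequential_mean_test([(1.0, 0.0)] * 5, a_star, b_star))
```

With these thresholds the probability of crossing $a^*$ when $\mu_1 - \mu_2 = h_0$ is approximately $\alpha$, which is the sense in which $\alpha$ enters item (2) above.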

1.9b. The problem of discriminating between variances when the means are known.
Let us assume that the distributions of $x_1$ in $\pi_1$ and $x_2$ in $\pi_2$ are normal with known
means but unknown variances. We are required to choose that process which
has the smaller variance. Without any loss of generality we shall suppose that
the means of $x_1$ and $x_2$ are zero. Since $f(x, \sigma)$ is normal, it is given by

(1.903) $f(x, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-x^2 / 2\sigma^2}$

which is of the form considered in Section 1.8 with $u(x) = x^2$ and $v(\sigma) = -\frac{1}{2\sigma^2}$.
Hence the decision function $z^*$ is given by

(1.904) $z^* = x_2^2 - x_1^2$

and the power of the test depends on $h = \frac{1}{2}(\sigma_2^{-2} - \sigma_1^{-2})$ and is given by (1.406)
with $a$ and $b$ replaced by $a^*$ and $b^*$, respectively. The sequential test is performed
in the following manner: We take one pair of observations at a time, one
from $\pi_1$ and one from $\pi_2$. We continue sampling as long as $\sum_{\alpha=1}^{n} (x_{2\alpha}^2 - x_{1\alpha}^2)$ lies
between $-b^*$ and $a^*$. Whenever $\sum_{\alpha=1}^{n} (x_{2\alpha}^2 - x_{1\alpha}^2) > a^*$, we conclude that $\sigma_2^2 > \sigma_1^2$.
Whenever $\sum_{\alpha=1}^{n} (x_{2\alpha}^2 - x_{1\alpha}^2) < -b^*$, we conclude that $\sigma_2^2 < \sigma_1^2$.

$a^*$ and $b^*$ are defined by

$a^* = \frac{\log A}{\frac{1}{2}[(\sigma_2^0)^{-2} - (\sigma_1^0)^{-2}]}$

and

$-b^* = \frac{\log B}{\frac{1}{2}[(\sigma_2^0)^{-2} - (\sigma_1^0)^{-2}]}$.

Thus $a^*$ and $b^*$ are defined by a specific value of $\sigma_2^{-2} - \sigma_1^{-2}$ and $A$ and $B$. If we
take $(\sigma_2^0)^{-2} - (\sigma_1^0)^{-2}$ as negative, then $A = \frac{\beta}{1 - \alpha}$ and $B = \frac{1 - \beta}{\alpha}$, where $\alpha$ is the
probability of concluding that $\sigma_1^2 < \sigma_2^2$ when in fact $\sigma_2^{-2} - \sigma_1^{-2} = -[(\sigma_2^0)^{-2} - (\sigma_1^0)^{-2}]$ and
$\beta$ is the probability of concluding $\sigma_2^2 < \sigma_1^2$ when in fact
$\sigma_2^{-2} - \sigma_1^{-2} = [(\sigma_2^0)^{-2} - (\sigma_1^0)^{-2}]$.

5 The curve defined by (1.406) is a monotonic function of $h = \mu_1 - \mu_2$. Hence the
probability of rejecting the hypothesis that $\pi_1$ has the larger mean is $\le \alpha$ whenever
$\mu_1 - \mu_2 \ge h_0$, so that $\alpha$ is the maximum risk of an erroneous decision. A similar
statement can be made concerning the risk $\beta$.

1.9c. The problem of discriminating between variances when the means are unknown.
Let the measured characteristics in $\pi_1$ and $\pi_2$ be assumed to be normally
distributed with unknown means and unknown variances. We desire to choose,
on the basis of a sequential test, that process which has the smaller variance no
matter what the means are. This will be accomplished by reducing the problem
to that treated in Section 1.9b.

Let $x_{11}, x_{12}, x_{13}, \ldots$ be the successive observations from $\pi_1$ and $x_{21}, x_{22}, x_{23}, \ldots$
the successive observations from $\pi_2$. Consider the transformation

$y_{11} = \frac{1}{\sqrt{2}} x_{11} - \frac{1}{\sqrt{2}} x_{12}$,

$y_{12} = \frac{1}{\sqrt{6}} x_{11} + \frac{1}{\sqrt{6}} x_{12} - \frac{2}{\sqrt{6}} x_{13}$,

$\cdots$

$y_{1(n-1)} = \frac{1}{\sqrt{n(n-1)}} x_{11} + \frac{1}{\sqrt{n(n-1)}} x_{12} + \cdots - \frac{n-1}{\sqrt{n(n-1)}} x_{1n}$,

with $y_{21}, y_{22}, \ldots$ similarly defined in terms of $x_{21}, x_{22}, \ldots, x_{2n}, \ldots$.

It is obvious that this transformation can be applied sequentially. Moreover,
it is easy to show that

(1) The expected values of the y's are zero.

(2) The variances of the y's are the same as the variances of the x's.

(3) The y's are normally and independently distributed.

Hence we can apply the sequential test developed in Section 1.9b to the y's
without any alterations. The decision function $Z^*$ will be given by

(1.905) $Z^* = \sum_{\alpha=1}^{n} (y_{2\alpha}^2 - y_{1\alpha}^2)$.

But it can be easily shown that

$\sum_{\alpha=1}^{n} y_{2\alpha}^2 = \sum_{\alpha=1}^{n+1} (x_{2\alpha} - \bar{x}_2)^2$,

$\sum_{\alpha=1}^{n} y_{1\alpha}^2 = \sum_{\alpha=1}^{n+1} (x_{1\alpha} - \bar{x}_1)^2$,

where $\bar{x}_1$ and $\bar{x}_2$ are the arithmetic means of the observations in $\pi_1$ and $\pi_2$
respectively. Hence (1.905) is equivalent to

(1.906) $Z^* = \sum_{\alpha=1}^{n+1} (x_{2\alpha} - \bar{x}_2)^2 - \sum_{\alpha=1}^{n+1} (x_{1\alpha} - \bar{x}_1)^2$.

Thus, to perform this sequential test, the population means need not be known.
The only difference between the tests considered in 1.9b and 1.9c is that 1.9c
requires one additional pair of observations.⁶
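The identity between the sum of squared y's and the sum of squared deviations from the mean, on which (1.906) rests, is easy to verify numerically. The following sketch applies the transformation of Section 1.9c to a short list of observations from one process; the function name `helmert` and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def helmert(xs):
    """Apply the sequential transformation of Section 1.9c to one process.

    y_k = (x_1 + ... + x_k - k * x_{k+1}) / sqrt(k * (k + 1)),  k = 1, ..., len(xs) - 1.
    """
    ys = []
    running_sum = 0.0
    for k in range(1, len(xs)):
        running_sum += xs[k - 1]                       # x_1 + ... + x_k
        ys.append((running_sum - k * xs[k]) / math.sqrt(k * (k + 1)))
    return ys


def sum_sq_dev(xs):
    """Sum of squared deviations of xs from their arithmetic mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


if __name__ == "__main__":
    xs = [2.0, -1.0, 0.5, 3.0]                         # n + 1 = 4 observations
    ys = helmert(xs)                                   # n = 3 transformed values
    print(sum(y * y for y in ys), sum_sq_dev(xs))      # the two sums agree
```

Because each new $y$ uses only the observations taken so far, the sums can be updated one pair at a time, which is what allows the transformation to be applied sequentially.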

1.9d. The problem of discriminating between means when the variates have a
Poisson distribution. Let the distribution of $x_1$ in $\pi_1$ be given by $\frac{e^{-m_1} m_1^{x_1}}{x_1!}$ and
the distribution of $x_2$ in $\pi_2$ be given by $\frac{e^{-m_2} m_2^{x_2}}{x_2!}$, where $x_1$ and $x_2$ each take on the
values 0, 1, 2, $\ldots$. It is desired to test the hypothesis that the mean in $\pi_1$
is smaller than the mean in $\pi_2$ against the alternative that the reverse is true.
Since the Poisson distribution can be written as

(1.907) $f(x, m) = e^{x \log m - m - \log x!}$,

it is of the form considered in Section 1.8 with $u(x) = x$ and $v(m) = \log m$.
Hence the decision function $z^*$ is given by

$z^* = x_2 - x_1$

and the power of the test depends on $h = \log \frac{m_1}{m_2}$. The sequential test is performed
in the following manner: We take one observation from $\pi_1$ and one from
$\pi_2$ in succession. If at any stage $\sum_{\alpha=1}^{n} (x_{2\alpha} - x_{1\alpha}) < -b^*$, we conclude that $m_2$
is smaller than $m_1$. If $\sum_{\alpha=1}^{n} (x_{2\alpha} - x_{1\alpha}) > a^*$, we conclude that $m_1$ is smaller than
$m_2$. If neither holds, we take another pair of observations. This process is
continued until one or the other decision is reached. The quantities $a^*$ and $b^*$
are given by

(1.908) $a^* = \frac{\log \frac{\beta}{1 - \alpha}}{\log u_0}$

(1.909) $-b^* = \frac{\log \frac{1 - \beta}{\alpha}}{\log u_0}$

where $u_0 = m_1^0 / m_2^0$, which is assumed to be less than one, $\alpha$ is the desired
probability of concluding that $m_2$ is smaller than $m_1$ when in fact $m_1/m_2 = u_0 < 1$, and
$\beta$ is the probability of concluding that $m_1$ is smaller than $m_2$ when in fact
$m_1/m_2 = 1/u_0$. The power curve is given by

(1.910) $L_u = \frac{u^{a^* + b^*} - u^{b^*}}{u^{a^* + b^*} - 1}$,

where $u = m_1/m_2$.

6 The method employed here was discovered independently by Charles Stein and the
author as a solution to a different sequential problem.

1.9e. Double dichotomies.⁷ We are given two processes $\pi_1$ and $\pi_2$, one yielding
a fraction defective $p_1$ and the other $p_2$. We shall assume that $p_1$ and $p_2$ are
unknown. We desire to choose on the basis of a sample that process which gives
the smaller fraction defective. That is, we wish to devise a test which gives a
high probability of accepting $\pi_1$ if $p_1 < p_2$ and a high probability of accepting
$\pi_2$ if $p_2 < p_1$. If $p_1 = p_2$, we might be more or less indifferent as to which
process we select.

Before we can answer this question, we must decide: (a) the minimum difference
between the two processes which we consider worth detecting; and (b)
if the two processes differ at least by the amount specified in (a), the minimum
probability with which we desire to make the correct decision.

In the proposed test, the decision function is given by $z^* = x_2 - x_1$ where
$x_i$, $(i = 1, 2)$, takes on the values 0 or 1, depending on whether the $i$th process
yields a nondefective or defective item. The difference between the two processes
is measured by⁸ $u = \frac{p_1}{1 - p_1} \Big/ \frac{p_2}{1 - p_2}$ (the ratio of the odds). It can easily be
seen that when $u < 1$, $p_1 < p_2$ and when $u > 1$, $p_1 > p_2$. If $u = 1$, $p_1 = p_2$.
Let $u_0$ represent a quantity less than 1. Furthermore, let $\alpha$ be the probability
of accepting $\pi_2$ when in fact the point $(p_1, p_2)$ lies on the curve $\frac{p_1 q_2}{q_1 p_2} = u_0$; and
$\beta$ be the probability of accepting $\pi_1$ when in fact the true point $(p_1, p_2)$ lies on the
curve $\frac{p_2 q_1}{q_2 p_1} = u_0$, where $q_i = 1 - p_i$.

Once $u_0$, $\alpha$ and $\beta$ are chosen, we compute

(1) $a^* = \frac{\log \frac{\beta}{1 - \alpha}}{\log u_0}$

and

(2) $-b^* = \frac{\log \frac{1 - \beta}{\alpha}}{\log u_0}$.

We then proceed as follows: We take one item from each process in sequence
and cumulate the number of defectives $d_1$ in process $\pi_1$ and $d_2$ in process $\pi_2$.
Whenever $d_2 - d_1 < -b^*$, we choose process $\pi_2$. Whenever $d_2 - d_1 > a^*$,
we choose process $\pi_1$. Whenever $d_2 - d_1$ lies between $-b^*$ and $a^*$, we take
another pair of observations, one from each process. This procedure is continued
until one or the other decision is reached.

7 For a solution of a more general problem in double dichotomies using a different
approach, see [1], section 5.32 and [4], section 3.

8 This follows from the fact that the binomial distribution can be written as
$f(x, p) = e^{x \log(p/q) + \log q}$, where $x$ takes on the values 0 or 1. Hence the distribution is of
the form considered in Section 1.8 with $v(p) = \log p/q$, $w(p) = \log q$, and $z^* = x_2 - x_1$.

1.9e1. The exact value of the power function for double dichotomies. Since
$d_2 - d_1$ changes at most in steps of one unit, it must follow that whenever a
decision is reached at $a^*$, the difference between $a^*$ and $d_2 - d_1$ is either zero (if
$a^*$ is an integer), or the difference between $a^*$ and $d_2 - d_1$ is constant for all
values of $n$. A similar argument holds for $b^*$. This permits us to compute the
power function without any approximations. Let $\bar{a}$ be the next positive integer
larger than $a^*$ if $a^*$ is not an integer, and $\bar{a} = a^*$ if $a^*$ is an integer. Let $\bar{b}$ be
the next positive integer larger than $b^*$ if $b^*$ is not an integer, and $\bar{b} = b^*$ if $b^*$ is
an integer. Then we see that the equation (1.406) for the power curve can be
given without any approximations by the formula

(1.9101) $L_u = (u^{\bar{a} + \bar{b}} - u^{\bar{b}}) / (u^{\bar{a} + \bar{b}} - 1)$.

1.9e2. The exact average sample number for double dichotomies. Let $Z_n = d_2 - d_1$
and let the point $(p_1, p_2)$ be on some curve $\frac{p_1 q_2}{p_2 q_1} = u$. Let $E(n \mid p_1, p_2)$ be
the expected number of pairs of observations required before a decision is reached.
Let $L_u$ = probability of reaching $-\bar{b}$ (i.e., $L_u$ is the probability that $\pi_2$ is accepted).
Then $1 - L_u$ is the probability of reaching $\bar{a}$ (i.e., $1 - L_u$ is the probability
that $\pi_1$ is accepted). Then by Wald's Fundamental Identity we have⁹

(1.911) $EZ_n = Ez \, E(n \mid p_1, p_2)$.

Now, $Ez = p_2 - p_1$, and $EZ_n = -L_u \bar{b} + (1 - L_u) \bar{a}$. Hence

(1.912) $E(n \mid p_1, p_2) = \frac{-L_u \bar{b} + (1 - L_u) \bar{a}}{p_2 - p_1}$.

It will be noted that while $L_u$ depends only on $u = \frac{p_1 q_2}{p_2 q_1}$, $E(n \mid p_1, p_2)$ depends not
only on the ratio of the odds but also on the difference between the two fractions
defective.

9 For a derivation of formula (1.911) which does not depend on the Fundamental Identity,
see Wald [1], page 142.
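Formulas (1.9101) and (1.912) can be compared with a direct computation. The sketch below treats $d_2 - d_1$ as a walk with steps $-1$, $0$, $+1$ of probabilities $p_1 q_2$, $p_1 p_2 + q_1 q_2$, $p_2 q_1$, absorbed at $-\bar{b}$ and $\bar{a}$, and propagates its distribution step by step. The function names, the truncation at a fixed number of steps, and the numerical values are assumptions made for illustration.

```python
def odds_ratio(p1, p2):
    """u = (p1 / q1) / (p2 / q2), the ratio of the odds."""
    return (p1 * (1.0 - p2)) / (p2 * (1.0 - p1))


def exact_power(u, a_bar, b_bar):
    """L_u of (1.9101): probability that d2 - d1 reaches -b_bar before a_bar."""
    if u == 1.0:
        return a_bar / (a_bar + b_bar)        # limiting value of (1.9101) as u -> 1
    return (u ** (a_bar + b_bar) - u ** b_bar) / (u ** (a_bar + b_bar) - 1.0)


def exact_asn(p1, p2, a_bar, b_bar):
    """E(n | p1, p2) of (1.912); not defined when p1 == p2."""
    L = exact_power(odds_ratio(p1, p2), a_bar, b_bar)
    return (-L * b_bar + (1.0 - L) * a_bar) / (p2 - p1)


def chain_check(p1, p2, a_bar, b_bar, steps=5000):
    """Brute force: propagate the distribution of d2 - d1 until absorption.

    Returns (probability of absorption at -b_bar, expected number of pairs),
    truncated after `steps` pairs.
    """
    P1, P3 = p1 * (1.0 - p2), p2 * (1.0 - p1)  # step -1 and step +1
    P2 = 1.0 - P1 - P3                          # step 0
    mass = {0: 1.0}
    hit_low, expected_n = 0.0, 0.0
    for n in range(1, steps + 1):
        new_mass = {}
        for state, m in mass.items():
            for step, pr in ((-1, P1), (0, P2), (1, P3)):
                nxt = state + step
                if nxt == -b_bar:
                    hit_low += m * pr
                    expected_n += n * m * pr
                elif nxt == a_bar:
                    expected_n += n * m * pr
                else:
                    new_mass[nxt] = new_mass.get(nxt, 0.0) + m * pr
        mass = new_mass
    return hit_low, expected_n


if __name__ == "__main__":
    p1, p2, a_bar, b_bar = 0.1, 0.2, 3, 4
    print(exact_power(odds_ratio(p1, p2), a_bar, b_bar), exact_asn(p1, p2, a_bar, b_bar))
    print(chain_check(p1, p2, a_bar, b_bar))
```

The agreement is exact up to rounding because the walk moves in unit steps and so never overshoots an integer boundary, which is the observation on which Section 1.9e1 relies.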

1.9e3. The distribution of n for double dichotomies. In this section we shall be
concerned with the probability of reaching a decision with exactly $n$ pairs of
observations.

Let $a$ and $b$ be two positive integers and let the sequential test be defined by
the decision function $Z_n = \sum_{\alpha=1}^{n} z_\alpha$, where $z_\alpha$ takes on the values $-1$, $0$, and $1$ with
probabilities $P_1$, $P_2$, and $P_3$, respectively. In terms of double dichotomies,
$Z_n = d_2 - d_1$ where $d_2$ and $d_1$ are the cumulative numbers of defectives obtained
sequentially from $\pi_2$ and $\pi_1$, respectively, and $P_1 = p_1 q_2$, $P_2 = p_1 p_2 + q_1 q_2$,
$P_3 = p_2 q_1$, where $p_1$ is the fraction defective yielded by $\pi_1$ and $p_2$ the fraction
defective yielded by $\pi_2$.

By the Fundamental Identity we have for any $t$ in the complex plane for which
$|\varphi(t)| \ge 1$,

(1.913) $L_u e^{-tb} E_1[\varphi(t)]^{-n} + (1 - L_u) e^{ta} E_2[\varphi(t)]^{-n} = 1$

where $L_u$ is the probability that $Z_n = -b$ when $p_1$ and $p_2$ are such that $\frac{P_1}{P_3} = u$,
$E_1$ and $E_2$ are the appropriate conditional expectations, and

(1.914) $\varphi(t) = P_1 e^{-t} + P_2 + P_3 e^{t}$.

If we examine Wald's proof of Lemma II [2], we see that $\varphi(t) \ge 1$ for all real
values of $t$ which lie outside the open interval $(0, h)$ where $h$ is the root of the
equation $\varphi(t) = 1$. Hence, it must follow that the Fundamental Identity (1.913)
must also hold for all real values of $t$ with the possible exception of the open
interval $(0, h)$. This fact will be used in the subsequent discussion.

We shall first obtain the distribution of $n$ when $a = \infty$. From equation
(1.910) we see that when $a$ approaches $\infty$, $L_u$ approaches 1 for $u > 1$ and $u^b$ for
$u < 1$. We shall assume that $u > 1$. Then for $t$ negative and $a = \infty$ the
Fundamental Identity (1.913) becomes

(1.915) $e^{-tb} E[\varphi(t)]^{-n} = 1$

or

(1.916) $E[\varphi(t)]^{-n} = e^{tb}$.

Now for all $u > 1$, $P_1 > P_3$, and hence $Ez = P_3 - P_1$ is negative. Since the
real roots of $\varphi(t) = 1$ are opposite in sign to $Ez$, it must follow that (1.916) holds
for all $t$ in the interval $(-\infty, 0)$. Now set $e^t = x$. Then (1.916) can be written
as

(1.917) $E\left(\frac{P_1}{x} + P_2 + P_3 x\right)^{-n} = x^b$

and (1.917) is valid for all $x$ in the interval $0 < x < 1$.
Now set

(1.918) $\frac{P_1}{x} + P_2 + P_3 x = \frac{1}{\tau}$.

Then for any specified value of $\tau$ there will be two values of $x$, say $x_1(\tau)$ and $x_2(\tau)$.
As $\tau$ approaches 0, one of these values of $x$ will approach zero and the other
infinity. Let $x_1(\tau)$ be the value of $x$ in (1.918) which approaches zero as $\tau$
approaches zero. Substituting (1.918) in (1.917) we get

(1.919) $E\tau^n = [x_1(\tau)]^b$.

But $E\tau^n$ is the generating function of $n$. Hence if we could expand $E\tau^n$ as a
power series in $\tau$, then the probability that $Z_n = -b$ in exactly $n$ steps would be given
by the coefficient of $\tau^n$. We are thus led to consider the expansion of $[x_1(\tau)]^b$
in a power series in $\tau$.

We multiply (1.918) by $\tau x$ and get

(1.920) $x = \tau(P_3 x^2 + P_2 x + P_1)$.

Then since $x_1(\tau)$ approaches 0 as $\tau$ approaches 0, we can expand $[x_1(\tau)]^b$ by
Lagrange's formula,¹⁰ and get

(1.921) $[x_1(\tau)]^b = \sum_{n=1}^{\infty} \frac{\tau^n}{n!} \frac{d^{n-1}}{d\xi^{n-1}} \left[ b \xi^{b-1} (P_1 + P_2 \xi + P_3 \xi^2)^n \right]_{\xi = 0}$

where the expansion is valid for $x_1(\tau)$ sufficiently close to zero. Hence, if $P_n(b)$
is the probability that exactly $n$ pairs of observations are required to reach a
decision, then

(1.922) $P_n(b) = \frac{b}{n!} \frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{b-1} (P_1 + P_2 \xi + P_3 \xi^2)^n \right]_{\xi = 0}$.

Now

(1.923) $\frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{b-1} (P_1 + P_2 \xi + P_3 \xi^2)^n \right]_{\xi = 0} = \sum_{i=0}^{n} \frac{n!}{i!(n-i)!} \sum_{j=0}^{n-i} \frac{(n-i)!}{j!(n-i-j)!} P_1^{j} P_2^{n-i-j} P_3^{i} \, \frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{n+i-j+b-1} \right]_{\xi = 0}$.

But

(1.924) $\frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{n+i-j+b-1} \right]_{\xi = 0} = 0$

unless $n = n + i - j + b$, i.e., $j = i + b$, in which case

(1.925) $\frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{n-1} \right]_{\xi = 0} = (n - 1)!$.

Also, since the subscript $j$ ranges from 0 to $n - i$, it must follow that $j \le n - i$.
Hence, $i + b \le n - i$, or $i \le \frac{n - b}{2}$. Substituting (1.924) and (1.925) into (1.923)
and simplifying, we get for $P_n(b)$

(1.926) $P_n(b) = b \sum_{i=0}^{m} \frac{(n-1)! \, P_1^{i+b} P_2^{n-2i-b} P_3^{i}}{i! \, (i+b)! \, (n-2i-b)!}$

where $m = \frac{n - b}{2}$ when $n - b$ is even and $m = \frac{n - b - 1}{2}$ when $n - b$ is odd.

10 See, for example, Mathematical Analysis, Vol. 1 (paragraph 189), by Goursat-Hedrick.
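Formula (1.926) lends itself to a direct numerical check. The sketch below evaluates it and compares it with a brute-force computation of the probability that a walk with steps $-1$, $0$, $+1$ (probabilities $P_1$, $P_2$, $P_3$) first reaches $-b$ at step $n$ when there is no upper boundary. The function names and the probabilities used are illustrative assumptions.

```python
from math import factorial


def p_n(n, b, P1, P2, P3):
    """P_n(b) of (1.926), for integers n >= 1 and b >= 1."""
    total = 0.0
    # i runs from 0 to m = floor((n - b) / 2); the range is empty when n < b.
    for i in range((n - b) // 2 + 1):
        total += (P1 ** (i + b) * P2 ** (n - 2 * i - b) * P3 ** i
                  / (factorial(i) * factorial(i + b) * factorial(n - 2 * i - b)))
    return b * factorial(n - 1) * total


def first_passage(n_max, b, P1, P2, P3):
    """Brute force: probs[n] = P(the walk first reaches -b at step n)."""
    mass = {0: 1.0}                       # distribution over not-yet-absorbed states
    probs = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        new_mass = {}
        for state, m in mass.items():
            for step, pr in ((-1, P1), (0, P2), (1, P3)):
                nxt = state + step
                if nxt == -b:
                    probs[n] += m * pr    # absorbed at -b on this step
                else:
                    new_mass[nxt] = new_mass.get(nxt, 0.0) + m * pr
        mass = new_mass
    return probs


if __name__ == "__main__":
    P1, P2, P3 = 0.3, 0.5, 0.2
    brute = first_passage(12, 2, P1, P2, P3)
    for n in range(1, 13):
        print(n, p_n(n, 2, P1, P2, P3), brute[n])
```

For example, with $b = 2$ and $n = 4$ the formula gives $3P_1^2 P_2^2 + 2P_1^3 P_3$, which corresponds to the three arrangements of two zero steps before the final $-1$ and the two admissible arrangements containing one $+1$ step.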

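We shall now obtain the distribution of $n$ when $a$ is finite.

As before, let $x_1(\tau)$ and $x_2(\tau)$ be the roots of the equation (1.918). Then from
(1.913) we have

(1.927) $L_u [x_1(\tau)]^{-b} E_1 \tau^n + (1 - L_u) [x_1(\tau)]^{a} E_2 \tau^n = 1$,

(1.928) $L_u [x_2(\tau)]^{-b} E_1 \tau^n + (1 - L_u) [x_2(\tau)]^{a} E_2 \tau^n = 1$.

Solving for $E_1 \tau^n$ and $E_2 \tau^n$ from (1.927) and (1.928) we get

(1.929) $L_u E_1 \tau^n = \frac{[x_1(\tau) x_2(\tau)]^b \{ [x_2(\tau)]^a - [x_1(\tau)]^a \}}{[x_2(\tau)]^{a+b} - [x_1(\tau)]^{a+b}}$,

(1.930) $(1 - L_u) E_2 \tau^n = \frac{[x_2(\tau)]^b - [x_1(\tau)]^b}{[x_2(\tau)]^{a+b} - [x_1(\tau)]^{a+b}}$.

We shall first obtain the probability $Q_n(b)$ that $Z_n = -b$. This is given by the
coefficient of $\tau^n$ in the expansion of $L_u E_1 \tau^n$ in a power series in $\tau$. From (1.918)
we see that $x_1(\tau) x_2(\tau) = \frac{P_1}{P_3}$. Hence we can write (1.929) as

(1.931) $L_u E_1 \tau^n = \frac{[x_1(\tau)]^b - \left(\frac{P_3}{P_1}\right)^a [x_1(\tau)]^{b+2a}}{1 - \left(\frac{P_3}{P_1}\right)^{a+b} [x_1(\tau)]^{2a+2b}}$.

Applying Lagrange's formula, we get for $Q_n(b)$

(1.932) $Q_n(b) = \frac{1}{n!} \frac{d^{n-1}}{d\xi^{n-1}} \left[ (P_1 + P_2 \xi + P_3 \xi^2)^n f'(\xi) \right]_{\xi = 0}$

where

(1.933) $f(\xi) = \frac{\xi^b - \left(\frac{P_3}{P_1}\right)^a \xi^{b+2a}}{1 - \left(\frac{P_3}{P_1}\right)^{a+b} \xi^{2a+2b}}$.

But $f(\xi)$ can be expanded in a power series in $\xi$,

(1.934) $f(\xi) = \sum_{k=0}^{\infty} \left(\frac{P_3}{P_1}\right)^{k(a+b)} \xi^{(2k+1)b + 2ka} - \sum_{k=0}^{\infty} \left(\frac{P_3}{P_1}\right)^{(k+1)a + kb} \xi^{(2k+1)b + (2k+2)a}$.

Hence

(1.935) $Q_n(b) = \frac{1}{n!} \sum_{k=0}^{\infty} \left(\frac{P_3}{P_1}\right)^{k(a+b)} [(2k+1)b + 2ka] \, \frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{(2k+1)b + 2ka - 1} (P_1 + P_2 \xi + P_3 \xi^2)^n \right]_{\xi = 0}$
$- \frac{1}{n!} \sum_{k=0}^{\infty} \left(\frac{P_3}{P_1}\right)^{(k+1)a + kb} [(2k+1)b + (2k+2)a] \, \frac{d^{n-1}}{d\xi^{n-1}} \left[ \xi^{(2k+1)b + (2k+2)a - 1} (P_1 + P_2 \xi + P_3 \xi^2)^n \right]_{\xi = 0}$.

Comparing (1.935) with (1.922) we see that

(1.936) $Q_n(b) = P_n(b) - \left(\frac{P_3}{P_1}\right)^{a} P_n(b + 2a) + \left(\frac{P_3}{P_1}\right)^{a+b} P_n(3b + 2a) - \cdots$,

the terms in the series being alternately of the form
$\left(\frac{P_3}{P_1}\right)^{k(a+b)} P_n[(2k+1)b + 2ka]$ and
$-\left(\frac{P_3}{P_1}\right)^{(k+1)a + kb} P_n[(2k+1)b + (2k+2)a]$, for $k = 0, 1, \ldots$.

The series stops by itself as soon as the argument of $P_n$ becomes greater than $n$.

If we compare (1.930) with (1.929), we see that the probability that $Z_n = a$
with exactly $n$ pairs of observations is given by (1.936) with $a$ and $b$ interchanged
and the result multiplied by $(P_3/P_1)^a$.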
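The alternating series (1.936) can be checked against a direct computation with two absorbing boundaries. In the sketch below, `p_n` evaluates (1.926), `q_n` sums (1.936) until the argument of $P_n$ exceeds $n$, and `two_barrier` propagates the walk step by step; all names and numerical values are illustrative assumptions.

```python
from math import factorial


def p_n(n, c, P1, P2, P3):
    """P_n(c) of (1.926); zero whenever c > n."""
    total = 0.0
    for i in range((n - c) // 2 + 1):
        total += (P1 ** (i + c) * P2 ** (n - 2 * i - c) * P3 ** i
                  / (factorial(i) * factorial(i + c) * factorial(n - 2 * i - c)))
    return c * factorial(n - 1) * total


def q_n(n, a, b, P1, P2, P3):
    """Q_n(b) of (1.936): absorption at -b on step n, with barriers at -b and +a."""
    r = P3 / P1
    total, k = 0.0, 0
    # Stop once the smaller of the two arguments for this k already exceeds n.
    while (2 * k + 1) * b + 2 * k * a <= n:
        total += r ** (k * (a + b)) * p_n(n, (2 * k + 1) * b + 2 * k * a, P1, P2, P3)
        total -= (r ** ((k + 1) * a + k * b)
                  * p_n(n, (2 * k + 1) * b + (2 * k + 2) * a, P1, P2, P3))
        k += 1
    return total


def two_barrier(n_max, a, b, P1, P2, P3):
    """Brute force: probs[n] = P(walk is absorbed at -b on step n, never having hit +a)."""
    mass = {0: 1.0}
    probs = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        new_mass = {}
        for state, m in mass.items():
            for step, pr in ((-1, P1), (0, P2), (1, P3)):
                nxt = state + step
                if nxt == -b:
                    probs[n] += m * pr
                elif nxt != a:            # mass reaching +a is absorbed and dropped
                    new_mass[nxt] = new_mass.get(nxt, 0.0) + m * pr
        mass = new_mass
    return probs


if __name__ == "__main__":
    P1, P2, P3 = 0.3, 0.5, 0.2
    brute = two_barrier(15, 2, 3, P1, P2, P3)
    for n in range(1, 16):
        print(n, q_n(n, 2, 3, P1, P2, P3), brute[n])
```

In the gambling interpretation discussed next, `q_n` is the probability that the player holding $b$ dollars is ruined at exactly the $n$th game.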

It is to be noted that the problem of double dichotomies is similar to the following
problem in games of chance. Two players A and B, possessing $a$ and $b$
dollars, respectively, are playing a game of chance which admits a draw. The
stake is one dollar per game. The probability that A will win one dollar is
$P_1$, the probability that B will win one dollar is $P_3$ and the probability of a draw
is $P_2$. In terms of this game, $L_u$ given by (1.910) is the probability that B
will be ruined in the long run, and $Q_n(b)$ in (1.936) is the probability that B will
be ruined in exactly $n$ games.

For a discussion of games of chance which do not permit a draw, see Introduction
to Mathematical Probability, Chapter VIII, by J. V. Uspensky. The development
presented above is in some respects similar to that given in Uspensky's
book. In Part II, we shall give a different and more general approach to the
problem of deriving the distribution of $n$ for sequential tests in which the variate
takes on a finite number of integral values.

AN APPROACH FOR QUANTIFYING PAIRED COMPARISONS AND
RANK ORDER¹

By Louis Guttman
Cornell University and War Department

1. Summary. Research for the Army demobilization point system evolved 
a new approach to paired comparisons and rank order. Each of N individuals
compares or ranks n things; the problem is to determine a numerical value for
each of the n things that will best represent the comparisons in some sense. The
new criterion adopted is that the numerical values be determined so as best to
distinguish between those things judged higher and those judged lower for each
individual. Least-squares is employed in the analysis, and the solution appears 
in the form of the latent vector associated with the largest root of a matrix ob- 
tained from the comparisons or rankings. 

This approach applies to the conventional problem of ordinary comparisons, 
the numerical solution being easily obtainable by simple iterations; the conven- 
tional use of hypothetical variables and unverified hypotheses is avoided. The 
Army point system is an example of a new and more complicated class of prob- 
lems; the same principle for the solution applies here, only more details occur 
in the derivations and computations. 

2. Introduction. The problem of paired comparisons arises when it is desired 
to obtain numerical values for a set of n things, with respect to one characteristic, 
such that these values will represent the judgments of a population of N in- 
dividuals. 

One procedure for obtaining the judgments is to have the individuals compare 
the things two at a time and to judge for each comparison which of the two 
things should be given the higher rank. An alternative procedure is to have 
each individual rank all the n things simultaneously. Such a ranking implies 
judging all the n(n — l)/2 comparisons at once; hence, the two procedures are 
substantially equivalent. Two noteworthy differences between the procedures 
are: (a) comparing two things at a time allows inconsistencies to appear within
judgments of an individual, and (b) it is sometimes harder in practice for people 
to judge n things simultaneously than to compare them two at a time. 

The problem of quantification, of course, is identical for both procedures, so 
we do not distinguish between them in this paper. The judgments vary from 
person to person (and possibly within a person), and the problem is to determine 
a set of numerical values for the things being compared that will in some sense 
best represent or average the judgments of the whole population.


1 Adapted from Report D-3, "An approach for quantifying paired comparisons," Research
Branch, Information and Education Division, Headquarters Army Service Forces,
Washington, D.C., 1945.




In some situations, the things being compared may be single items or objects;
this we shall call the case of ordinary comparisons. In other situations, the
things may be combinations of items or objects.

This paper is devoted to the presentation of a general approach to quantifying 
comparisons or rank orders, with particular application to ordinary comparisons 
and to the comparison of combinations of two things. It seems to differ from
previous approaches in at least two important respects: (a) it is based on but one 
simple principle, namely, that the quantification shall be the one best able to 
reproduce the judgment of each person in the population on each comparison; and, 
as a consequence, (b) the approach yields solutions not only to the traditional 
case of ordinary comparisons, but also to more complex cases that do not seem 
to have been discussed previously. 

An example of a major practical use of this approach is with respect to the
demobilization score card of the United States Army. The problem was to
determine the number of points to assign each of the variables on the score card
according to the opinions of the soldiers themselves. The research on this was
based on a form of paired comparisons more complicated than the ordinary one,
and had additional complications of curvilinearities of various sorts in the data.
Our approach handles such problems as well as the problem of ordinary com- 
parisons. 

Let us describe the score card problem in somewhat more detail. In a survey 
of enlisted men throughout the world by means of a questionnaire administered 
by field teams of the Research Branch, it was found that there were five variables 
that the men thought should receive consideration on the score card to determine 
order of demobilization: length of time in the Army, length of time overseas,
amount of combat, age, and number of children. 

The problem now was to determine how much weight to give each of these 
variables in obtaining total scores. According to ordinary paired comparisons, 
one would ask, for example, “Who should get out first after the war: a man 
who has two children or a man who has been in two battles?” But respondents 
refuse to judge such a comparison because the battle experience of the first man 
is not specified, nor is the number of progeny of the second man, so that there is 
insufficient basis for judgment. 

Therefore, in the actual research, judgments were asked on each of ten com- 
parisons put in the following form: 

“Here are three men of the same age, all overseas the same length of time. 
Check the one you would want to have let out first: 

A single man .... through two campaigns of combat 

A married man with no children .... through one campaign of combat 

A married man with two children . . . not in combat.” 

Each variable was compared with every other one in this fashion. 

The equations were derived for computing the relative number of points to
assign to each month in the army, each month overseas, etc., which would be
most consistent according to our principle. These are essentially the equations
developed in section 6 of this paper.

The results showed strong curvilinearities in the men's judgments. Amount
of combat received one amount of emphasis when compared with age, and another
amount of emphasis when compared with number of children. Since the score
card would be too complicated in practice if curvilinear scoring were used, 
equations were derived for the linear scoring scheme that would be most con- 
sistent according to our principle. These are essentially the equations derived 
in section 7. The weights arising out of the research were computed from such 
equations. 

The variable age received a slight negative weight, which justified dropping 
it from the score card. The weights the Army finally adopted for the remaining 
factors were modified from the research weights, but yield essentially the same 
results as the research weights. Demobilization scores obtained from the one 
system of weights correlate very highly with scores obtained from the other. 

It can now be revealed that the Army’s modification was essentially to reverse 
the weights for children and battles. In subsequent attitude surveys on how 
well the soldiers liked the point system [8], a major complaint was found to be 
that battles got too little weight compared with babies!


3. The basic principle. Our basic principle in deriving numerical values (let
us call them "x-values") for the things being compared requires that the x-values
of things a given person judges higher than other things should be as
different as possible from the x-values of the things he judges to be lower than
other things. This will be achieved if we make the x-values of things judged
higher as homogeneous as possible among themselves, and the x-values of things
judged lower as homogeneous as possible among themselves, for each individual.
In the language of analysis of variance, our principle calls for minimizing the
variation within individuals, compared with that within the group as a whole.²
The resulting x-values will tend to be the best for reproducing the judgment of
each individual on each comparison with a minimum overall proportion of
errors of reproduction [3, pp. 342-343]. The smaller this overall proportion of
error, the better the quantification represents the data. Least squares is used
for convenience for measuring variation in deriving the equations.

The previous literature on ordinary paired comparisons³ seems to have
concentrated largely on the problem of estimating the differences between means
of hypothetical variables assumed to underlie the judgments. Thurstone has 
shown that by using assumptions of normality of distribution, equality of vari- 
ances, and zero correlations among hypothetical variables, it is possible to 
estimate relative distances between means for some kinds of data. 


2 The present approach for quantification was suggested by previous work on
scale analysis; see [3]. This theory has been developed further by the
definition of a perfect scale. The equations for the perfect scale have
interesting properties that may be related to paired comparisons; these
equations are being prepared for publication. The referees have called my
attention to related work on quantification by R. A. Fisher in [1, p. 283].

3 A summary of the previous work, including that of Thurstone, is given in
[2, pp. 217-243]. For more recent work, see [7].

The problem of estimating differences between means is not identical with
that of reproducing individual judgments. For example, it can be shown, 
within the same framework of hypothetical variables conventionally used, that 
if variances are unequal and/or correlations are unequal then the means of the 
hypothetical variables are not in general the best quantification for reproducing 
individual judgments; the principal axis of certain product-moments of raw 
scores is the best quantification. It is in the special case where variances are 
equal, and where correlations are equal— not even necessarily equal to zero— 
that the principal axis is the set of means. Proof of this is given in the appendix. 

The approach of this paper does not use hypothetical variables, but inquires 
directly as to what numerical values can be derived from the observations that
will best reproduce those observations. 

In the next section is treated the case of ordinary comparisons. The more 
complicated problem of the demobilization score card is formalized in section 5, 
and the equations for its unrestricted solution are derived in section 6. Since 
the unrestricted solution brings out curvilinearities that may be present, and 
since the score card in practice required a linear scoring scheme, equations for 
the most consistent linear quantification are derived in section 7. These are 
essentially the equations used in the research on the weights for the score card. 

The appendix shows a distinction between the conventional principle of 
estimating mean differences of hypothetical variables and the present principle 
of representing the comparisons of each individual. 

4. The case of ordinary comparisons. Paired comparisons as treated in the 
literature seem concerned largely with the ordinary case where separate things 
are compared, rather than where combinations of things are compared. Our 
principle covers the ordinary case as well as more complex cases, and we shall 
treat the ordinary case first since it involves fewer details.

Let $O_1, O_2, \ldots, O_n$ be the n things to be compared, where the assigning
of subscripts is arbitrary. Each of N individuals is asked to make judgments of
the form that $O_j$ is higher than (or lower than) $O_k$. For convenience, we
assume the rules of the experiment to exclude judgments of equality. We shall
also assume that all people compare all the pairs. Hence, there are N sets of
$n(n - 1)/2$ comparisons. Considering each comparison as comprising two
judgments (one of "higher than" for one object and one of "lower than" for the
other), there is a total of $Nn(n - 1)$ judgments in the experiment.

The judgments of all the individuals on all the comparisons can be represented 
compactly as follows. Let 

(4.1)    $e_{ijk} = \begin{cases} 1 & \text{if individual } i \text{ judges } O_j > O_k \\ 0 & \text{if individual } i \text{ judges } O_j < O_k \\ 0 & \text{if } j = k. \end{cases}$

The ranges of subscripts, whether free or dummy, will always be:

(4.2)    $i = 1, 2, \ldots, N; \qquad j, k = 1, 2, \ldots, n,$


so that the ranges will not be explicitly stated again. 

Definition (4.1) implies that if $e_{ijk} = 1$, then $e_{ikj} = 0$, and that

(4.3)    $e_{ijk} + e_{ikj} = 1, \qquad (j \neq k).$

Let $f_{ij}$ be the number of things individual i judged to be lower than
$O_j$, and let $g_{ij}$ be the number of things he judged to be higher than
$O_j$. Then

(4.4)    $f_{ij} \equiv \sum_k e_{ijk}, \qquad g_{ij} \equiv \sum_k e_{ikj}.$

From (4.3) and (4.4), we have

(4.5)    $f_{ij} + g_{ij} = n - 1.$
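
As a minimal sketch of definitions (4.1) to (4.5), the following Python fragment
builds the indicators $e_{ijk}$ and the tallies $f_{ij}$ and $g_{ij}$. It
assumes, purely for illustration, that each person's judgments come from a
tie-free set of scores (the paper itself does not require the judgments to be
transitive). The function names and the sample data are hypothetical, and
subscripts are zero-based.

```python
def judgments_from_rankings(rankings):
    """rankings[i][j] is the (tie-free) score person i gives to object j.

    Returns e with e[i][j][k] = 1 if person i judges O_j > O_k, else 0,
    which matches (4.1), including e[i][j][j] = 0.
    """
    e = []
    for scores in rankings:
        n = len(scores)
        e.append([[1 if scores[j] > scores[k] else 0 for k in range(n)]
                  for j in range(n)])
    return e


def tallies(e):
    """f[i][j]: number of things judged lower than O_j; g[i][j]: number higher, (4.4)."""
    n = len(e[0])
    f = [[sum(ei[j][k] for k in range(n)) for j in range(n)] for ei in e]
    g = [[sum(ei[k][j] for k in range(n)) for j in range(n)] for ei in e]
    return f, g


# Two hypothetical judges, three objects.
e = judgments_from_rankings([[3, 1, 2], [1, 2, 3]])
f, g = tallies(e)
```

For every person and object in this example, `f[i][j] + g[i][j]` equals
n - 1 = 2, as (4.5) requires.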

Let F be the total number of comparisons made by each person; then

(4.6)    $F = n(n - 1)/2 = \sum_k f_{ik} = \sum_k g_{ik}.$

Let c be the number of times each $O_j$ was judged in the whole experiment, and
let C be the total number of judgments in the experiment:

(4.7)    $c = N(n - 1) \equiv \sum_i (f_{ij} + g_{ij}), \qquad C = Nn(n - 1).$

Both c and C count each comparison as two judgments, one of “lower than” 
and one of “higher than.” 

The means and variances to be considered are defined as follows. Let $x_j$
be the numerical value to be derived for $O_j$ on the basis of the comparisons.
Let $t_i$ be the mean of the x-values of the things individual i ranked higher
than the other things, weighted by the respective frequencies of the judgments,
and let $y_i$ be the sum of squares of deviations from their mean of these
x-values:

(4.8)    $t_i \equiv \frac{1}{F} \sum_k x_k f_{ik},$

(4.9)    $y_i \equiv \sum_k (x_k - t_i)^2 f_{ik} = \sum_k x_k^2 f_{ik} - t_i^2 F.$

Similarly, let $u_i$ and $z_i$ be the mean and sum of squares respectively for
the x-values of the things individual i ranked lower than other things:

(4.10)   $u_i \equiv \frac{1}{F} \sum_k x_k g_{ik},$

(4.11)   $z_i \equiv \sum_k (x_k - u_i)^2 g_{ik} = \sum_k x_k^2 g_{ik} - u_i^2 F.$


Let V be the mean of all the x-values in the experiment, and let W be the sum
of squares of deviations from their mean of the x-values:

(4.12)   $V \equiv \frac{1}{C} \sum_k x_k c = \frac{1}{n} \sum_k x_k,$

(4.13)   $W \equiv \sum_k (x_k - V)^2 c = c \sum_k x_k^2 - V^2 C.$

W is the total sum of squares for the experiment. Let R be the sum of squares
between individuals, and let S be the sum of squares within individuals:

(4.14)   $R \equiv \sum_i [(t_i - V)^2 + (u_i - V)^2] F = F \sum_i (t_i^2 + u_i^2) - V^2 C,$

(4.15)   $S \equiv \sum_i (y_i + z_i) = W - R.$


Our principle is to quantify the judgments by obtaining the x-values that will
minimize the variation within individuals compared to that of the group as a
whole. This means making S as small as possible compared with W, which is
equivalent to making R as large as possible compared with W.

Therefore, if we define the correlation ratio E by

(4.16)   $E^2 = 1 - S/W,$

the problem is to determine the $x_j$ that will maximize $E^2$.
A convenient formula for $E^2$ is, from (4.15) and (4.16),

(4.17)   $E^2 = R/W.$
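
The quantities entering (4.17) can be evaluated directly for any trial set of
x-values. The sketch below is one possible reading of (4.6) to (4.8), (4.10),
(4.12) to (4.14) and (4.17); the function name and the sample tallies are
hypothetical, and the tallies are assumed to come from complete sets of
comparisons as in this section.

```python
def correlation_ratio_squared(f, g, x):
    """Return E^2 = R / W for tallies f, g (one row per person) and values x.

    Raises ZeroDivisionError when all x-values are equal, since W is then zero.
    """
    N, n = len(f), len(x)
    F = n * (n - 1) / 2                      # comparisons per person, (4.6)
    c = N * (n - 1)                          # judgments per object, (4.7)
    V = sum(x) / n                           # overall mean, (4.12)
    W = c * sum((xk - V) ** 2 for xk in x)   # total sum of squares, (4.13)
    R = 0.0
    for i in range(N):
        t = sum(xk * fk for xk, fk in zip(x, f[i])) / F   # "higher" mean, (4.8)
        u = sum(xk * gk for xk, gk in zip(x, g[i])) / F   # "lower" mean, (4.10)
        R += ((t - V) ** 2 + (u - V) ** 2) * F            # between individuals, (4.14)
    return R / W


# Hypothetical tallies for two judges and three objects.
f = [[2, 0, 1], [0, 1, 2]]
g = [[0, 2, 1], [2, 1, 0]]
ratio = correlation_ratio_squared(f, g, [1.0, -1.0, 0.0])
```

Adding the same constant to every x-value leaves the returned ratio unchanged,
which is the translation invariance used in the text.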

Since $E^2$ is invariant with respect to translations of the x-values, we can
without loss of generality set

(4.18)   $V = 0.$

Then we can write from (4.14) and (4.13), respectively,

(4.19)   $R = F \sum_i (t_i^2 + u_i^2),$

(4.20)   $W = c \sum_k x_k^2.$


To find the maximizing values $x_j$ for $E^2$, we differentiate the right member
of (4.17) with respect to the $x_j$, set the derivatives equal to zero, and
obtain the stationary equations

(4.21)   $\frac{\partial R}{\partial x_j} = E^2 \frac{\partial W}{\partial x_j}.$

The derivatives of R can be evaluated by differentiating the right member of
(4.19) with the aid of (4.8):

(4.22)   $\frac{\partial R}{\partial x_j} = \frac{2}{F} \sum_k x_k \sum_i (f_{ij} f_{ik} + g_{ij} g_{ik}).$

From (4.20), the derivatives of W are

(4.23)   $\frac{\partial W}{\partial x_j} = 2 c x_j.$

If we let

(4.24)   $H_{jk} \equiv \frac{1}{cF} \sum_i (f_{ij} f_{ik} + g_{ij} g_{ik}),$

then (4.21) can be re-written from (4.22), (4.23), and (4.24) as

(4.25)   $\sum_k x_k H_{jk} = E^2 x_j.$

Equations (4.25) are the equations to be solved numerically for the maximizing
$x_j$.

Before indicating a procedure for the numerical solution, let us first verify
that a solution of (4.25) will satisfy (4.18). Summing both members of (4.25)
over j, and using (4.24) and relations among the notation previously defined,
we get

         $\sum_k x_k = E^2 \sum_j x_j,$

or, from (4.12),

(4.26)   $(1 - E^2) V = 0.$

Therefore, if $E^2 \neq 1$, we must have $V = 0$. Since a perfect correlation ratio
will not in general occur in practice, condition (4.18) will in general be satisfied 
by a solution of (4.25). 

There is always a trivial solution of (4.25) for which $E^2$ is formally equal
to unity. This is $x_j = 1$. For this trivial solution, $t_i = u_i = 1$;
$R = W = C$; $E^2 = 1$; and (4.25) is satisfied. Of course, E is not an actual
correlation ratio for this trivial solution.

The non-trivial solution of (4.25) can be carried out with the aid of matrix
algebra. Let x be a row vector of the n elements $x_j$, and let H be the
$n \times n$ symmetric matrix $\|H_{jk}\|$. H is not only symmetric but Gramian,
since its elements are product sums. Now (4.25) becomes the matric equation

(4.27)   $xH = E^2 x.$

Equation (4.27) shows that x is a latent vector of H, and $E^2$ is a latent root to
which this vector corresponds. Since we want the largest possible correlation 
ratio, we seek the largest of the non-trivial roots. If the two largest non-trivial 
roots are not equal, which should be the general case in practice, then there is a 
unique vector associated with the largest root which is the solution to our 
problem. 

The numerical solution of (4.27) can be carried out by the simple iterative 
technique for latent roots and vectors (see, for example [6]). The iterations 
converge in general to the vector associated with the largest root. To avoid 
convergence to the trivial solution (which formally has the largest root), the 
trial vectors should be adjusted to satisfy (4.18); then they will converge in
general to the vector associated with the largest non-trivial root. 
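
The iteration just described can be sketched as follows. This is an
illustrative power iteration, not necessarily the exact procedure of [6]. It
assumes $H_{jk} = \frac{1}{cF} \sum_i (f_{ij} f_{ik} + g_{ij} g_{ik})$, which is
the form implied by (4.21), (4.22) and (4.25). The function names, the
iteration count and the sample tallies are hypothetical.

```python
import math


def build_H(f, g):
    """Assumed form of H: product sums of the tallies divided by c * F."""
    N, n = len(f), len(f[0])
    F = n * (n - 1) / 2
    c = N * (n - 1)
    return [[sum(f[i][j] * f[i][k] + g[i][j] * g[i][k] for i in range(N)) / (c * F)
             for k in range(n)] for j in range(n)]


def center_and_scale(x):
    """Impose V = 0 as in (4.18), then scale to unit length.

    Fails with ZeroDivisionError if every element of x is the same.
    """
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]


def row_times_matrix(x, H):
    return [sum(x[j] * H[j][k] for j in range(len(x))) for k in range(len(x))]


def largest_nontrivial_root(H, trial, iterations=200):
    """Iterate x <- xH, re-centering each time to stay away from x_j = 1."""
    x = center_and_scale(trial)
    for _ in range(iterations):
        x = center_and_scale(row_times_matrix(x, H))
    root = sum(a * b for a, b in zip(row_times_matrix(x, H), x))  # x H x' with x x' = 1
    return root, x


H = build_H([[2, 0, 1], [0, 1, 2]], [[0, 2, 1], [2, 1, 0]])
root, x = largest_nontrivial_root(H, [1.0, 0.0, -1.0])
```

Re-centering after every multiplication is what keeps the iterates from
drifting toward the trivial solution; the returned root is the estimate of
$E^2$.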

A good way to choose a first trial vector is first to guess what the rank order
of the x-values will be. Let $r_j$ be the guessed rank of $x_j$, the $r_j$
comprising the integers from one to n. If n is odd, then as the first trial
$x_j$ use $r_j - (n + 1)/2$. If n is even, then as the first trial $x_j$ use
$2r_j - n - 1$.

A marginal check on the internal consistency of the judgments of the population
is to compare each difference $(x_j - x_k)$ with the corresponding difference
$(\sum_i e_{ijk} - \sum_i e_{ikj})$. If the population's judgments are
sufficiently consistent, the signs of the two differences will be alike for all
the comparisons. $\sum_i e_{ijk}$ is the frequency with which $O_j$ is judged
greater than $O_k$, and can be used as a basis for guessing the ranks of $x_j$
and $x_k$.
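
Both recipes above are easy to mechanize. The sketch below gives one possible
form of the first trial vector and of the marginal sign check. The function
names are hypothetical, subscripts are zero-based, and treating a zero
difference as its own sign (so that it disagrees with a nonzero difference) is
a choice made here, not one stated in the text.

```python
def first_trial_vector(guessed_ranks):
    """Centered integers built from guessed ranks 1..n, as described above."""
    n = len(guessed_ranks)
    if n % 2 == 1:
        return [r - (n + 1) // 2 for r in guessed_ranks]
    return [2 * r - n - 1 for r in guessed_ranks]


def sign(v):
    return (v > 0) - (v < 0)


def inconsistent_pairs(e, x):
    """Pairs (j, k), j < k, where x_j - x_k and the vote margin differ in sign.

    e[i][j][k] = 1 if person i judges O_j > O_k, else 0.
    """
    N, n = len(e), len(x)
    bad = []
    for j in range(n):
        for k in range(j + 1, n):
            margin = sum(e[i][j][k] - e[i][k][j] for i in range(N))
            if sign(x[j] - x[k]) != sign(margin):
                bad.append((j, k))
    return bad


trial = first_trial_vector([4, 1, 3, 2])   # n even, so 2r - n - 1 is used
```

Either form of the trial vector sums to zero, so it already satisfies (4.18).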

5. Comparing combinations of two things. The problem of the score card is
but one example of a class of problems that can be formalized as follows.
Consider a set of n items, where the jth item has $m_j$ categories. Let
$O_{jp}$ be the pth category of the jth item, ($p = 1, 2, \ldots, m_j$;
$j = 1, 2, \ldots, n$). The $O_{jp}$ may be either qualitative or quantitative,
and the order of subscripts assigned the categories can be arbitrary.

Each of N individuals is asked to make judgments of the form that the
combination $(O_{jp}, O_{kr})$ is greater than (or less than) the combination
$(O_{jq}, O_{ks})$. We shall assume that all people compare each of the pairs
of combinations, and that the rules of the experiment exclude judgments of
equality.

The judgments of all the individuals on all the comparisons can be represented
compactly as follows. Let

(5.1)    $e_{ijk/pr,qs} = \begin{cases} 1 & \text{if individual } i \text{ judges } (O_{jp}, O_{kr}) > (O_{jq}, O_{ks}) \\ 0 & \text{otherwise.} \end{cases}$

Here and throughout this paper the ranges of subscripts, whether free or dummy,
will always be as follows:

(5.2)    $i = 1, 2, \ldots, N; \qquad j, k = 1, 2, \ldots, n; \qquad p, q, r, s = 1, 2, \ldots, m_j$ (or $m_k$, as the case may be),

so that the ranges will not be explicitly stated again.

Definition (5.1) implies the symmetry

(5.3)    $e_{ijk/pr,qs} = e_{ikj/rp,sq},$

and that

(5.4)    $e_{ijk/pr,qs} + e_{ijk/qs,pr} = \begin{cases} 0 & \text{if individual } i \text{ omits the comparison of } (O_{jp}, O_{kr}) \text{ with } (O_{jq}, O_{ks}) \\ 1 & \text{if he judges these two combinations to be unequal.} \end{cases}$

Additional notation is defined as follows. Let $a_{ijk/pr}$ be the number of
combinations individual i judged to be lower than $(O_{jp}, O_{kr})$, and let
$b_{ijk/pr}$ be the number of combinations he judged to be higher than
$(O_{jp}, O_{kr})$:

(5.5)    $a_{ijk/pr} \equiv \sum_q \sum_s e_{ijk/pr,qs} = a_{ikj/rp},$

(5.6)    $b_{ijk/pr} \equiv \sum_q \sum_s e_{ijk/qs,pr} = b_{ikj/rp}.$

Let $c_{jk/pr}$ be the number of comparisons for all individuals involving
$(O_{jp}, O_{kr})$:

(5.7)    $c_{jk/pr} \equiv \sum_i (a_{ijk/pr} + b_{ijk/pr}) = c_{kj/rp}.$

Let $f_{ijp}$ be the number of times that $O_{jp}$ occurred in combinations that
were judged to be higher than other combinations by individual i, and let
$g_{ijp}$ be the number of times $O_{jp}$ occurred in combinations judged lower
than others:

(5.8)    $f_{ijp} \equiv \sum_k \sum_r a_{ijk/pr} = \sum_k \sum_r a_{ikj/rp},$

(5.9)    $g_{ijp} \equiv \sum_k \sum_r b_{ijk/pr} = \sum_k \sum_r b_{ikj/rp}.$

Let $A_{jp}$ be the total number of times in the entire experiment that
$O_{jp}$ was judged:

(5.10)   $A_{jp} \equiv \sum_i (f_{ijp} + g_{ijp}) = \sum_k \sum_r c_{jk/pr}.$

Let F be the total number of comparisons made by each person, and let C be
the total number of judgments in the entire experiment (a comparison comprises
two judgments, one of "higher than" and one of "lower than"):

(5.11)   $F \equiv \sum_j \sum_p f_{ijp} = \sum_j \sum_p g_{ijp},$

(5.12)   $C \equiv \sum_j \sum_p A_{jp} = 2NF.$

The means and variances required for the problem are defined as follows.
Let $x_{jp}$ be the numerical value to be derived for $O_{jp}$ from the
judgments. Let $t_i$ be the mean of the x-values of the combinations individual
i judged to be higher than other combinations, weighted by the respective
frequencies of such judgments, and let $u_i$ be the analogous mean of
combinations judged lower than others:

(5.13)   $t_i \equiv \frac{1}{F} \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr}) a_{ijk/pr} = \frac{2}{F} \sum_k \sum_r x_{kr} f_{ikr},$

(5.14)   $u_i \equiv \frac{1}{F} \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr}) b_{ijk/pr} = \frac{2}{F} \sum_k \sum_r x_{kr} g_{ikr}.$
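
To make the bookkeeping concrete, here is a small sketch for a single person.
It rests on one reading of the definitions: F is taken to be
$\sum_j \sum_p f_{ijp}$ as in (5.11), so that each combination contributes once
for each of its two categories, and the means carry the factor 2/F. The data
layout (pairs of `(item, category)` tuples), the function names and the sample
values are all hypothetical.

```python
from collections import Counter


def category_tallies(judgments):
    """Tally f and g for one person.

    judgments is a list of (higher, lower) pairs; each combination is a pair
    of (item, category) tuples.
    """
    f, g = Counter(), Counter()
    for higher, lower in judgments:
        for cell in higher:
            f[cell] += 1   # category occurred in a combination judged higher
        for cell in lower:
            g[cell] += 1   # category occurred in a combination judged lower
    return f, g


def weighted_means(f, g, x):
    """t and u as 2/F times the tally-weighted sums of x-values."""
    F = sum(f.values())
    t = 2 * sum(x[cell] * count for cell, count in f.items()) / F
    u = 2 * sum(x[cell] * count for cell, count in g.items()) / F
    return t, u


judgments = [
    (((0, 1), (1, 0)), ((0, 0), (1, 1))),
    (((0, 1), (1, 1)), ((0, 0), (1, 0))),
]
x = {(0, 0): 0.0, (0, 1): 3.0, (1, 0): 1.0, (1, 1): 5.0}
f, g = category_tallies(judgments)
t, u = weighted_means(f, g, x)
```

Under these assumptions t is the plain average of the summed x-values of the
combinations judged higher, and u is the same average for those judged lower.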

Let $y_i$ be the sum of squares of deviations from their mean of these "higher
than" x-values, and let $z_i$ be the analogous sum of squares for the "lower
than" x-values:

(5.15)   $y_i \equiv \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr} - t_i)^2 a_{ijk/pr} = \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr})^2 a_{ijk/pr} - t_i^2 F,$

(5.16)   $z_i \equiv \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr} - u_i)^2 b_{ijk/pr} = \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr})^2 b_{ijk/pr} - u_i^2 F.$

Let V be the mean of all x-values, weighted by their respective frequencies
in the entire experiment, and let W be the sum of squares of deviations from
their mean of these x-values:

(5.17)   $V \equiv \frac{1}{C} \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr}) c_{jk/pr} = \frac{2}{C} \sum_k \sum_r x_{kr} A_{kr},$

(5.18)   $W \equiv \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr} - V)^2 c_{jk/pr} = \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr})^2 c_{jk/pr} - V^2 C.$

W is the total sum of squares for the experiment. Let R be the sum of squares
between individuals for the experiment, and let S be the sum of squares within
individuals:

(5.19)   $R \equiv \sum_i [(t_i - V)^2 + (u_i - V)^2] F = F \sum_i (t_i^2 + u_i^2) - V^2 C,$

(5.20)   $S \equiv \sum_i (y_i + z_i) = W - R.$

Our principle for quantifying the judgments is to derive the x-values that will 
minimize the variation within individuals compared with that within the group 
as a whole. This means making S as small as possible compared with W. 
Therefore, if we define the correlation ratio E by

(5.21)   $E^2 = 1 - S/W,$

our problem is to determine the $x_{jp}$ that will maximize $E^2$.

A convenient formula for $E^2$ is, from (5.20) and (5.21),

(5.22)   $E^2 = R/W.$

Since $E^2$ is invariant with respect to translations of the x-values, we can
without loss of generality set

(5.23)   $V = 0.$

Then we can write, from (5.19) and (5.18) respectively,

(5.24)   $R = F \sum_i (t_i^2 + u_i^2),$

(5.25)   $W = \sum_j \sum_k \sum_p \sum_r (x_{jp} + x_{kr})^2 c_{jk/pr}.$


6. The unrestricted maximum. To find the maximizing x-values for $E^2$,
we differentiate the right member of (5.22) with respect to the $x_{jp}$ and
set the derivatives equal to zero. This yields the stationary equations

(6.1)    $\frac{\partial R}{\partial x_{jp}} = E^2 \frac{\partial W}{\partial x_{jp}}.$


To evaluate the partial derivatives of R, we differentiate the right member of
(5.24), using (5.13) and (5.14), and obtain

(6.2)    $\frac{\partial R}{\partial x_{jp}} = \frac{8}{F} \sum_i \sum_k \sum_r x_{kr} (f_{ijp} f_{ikr} + g_{ijp} g_{ikr}).$

Similarly for W, we differentiate the right member of (5.25) and obtain

(6.3)    $\frac{\partial W}{\partial x_{jp}} = 4 (x_{jp} A_{jp} + \sum_k \sum_r x_{kr} c_{jk/pr}).$

From (6.2) and (6.3), (6.1) can be written as

(6.4)    $\sum_k \sum_r x_{kr} h_{jk/pr} = \frac{1}{2} E^2 (x_{jp} A_{jp} + \sum_k \sum_r x_{kr} c_{jk/pr}),$

where

(6.5)    $h_{jk/pr} \equiv \frac{1}{F} \sum_i (f_{ijp} f_{ikr} + g_{ijp} g_{ikr}).$

The numerical solution of the x-values is to be obtained from (6.4).

Before showing a procedure for the numerical solution, let us verify that a
solution of (6.4) will also satisfy (5.23). Summing both members of (6.4) over
j and p, and using (6.5) and relations among the notation laid down in the
previous section, we get

         $\sum_k \sum_r x_{kr} A_{kr} = \frac{1}{2} E^2 (\sum_j \sum_p x_{jp} A_{jp} + \sum_k \sum_r x_{kr} A_{kr}),$

or

(6.6)    $(1 - E^2) \sum_k \sum_r x_{kr} A_{kr} = 0.$

From (5.17), this can be written as

(6.7)    $(1 - E^2) V = 0.$

Therefore, if $E^2 \neq 1$, we must have $V = 0$. Hence, any solution of (6.4)
which does not yield a perfect correlation ratio must have a weighted mean of
zero for the x-values. Since a perfect correlation ratio will not in general
occur in practice, condition (5.23) will in general be satisfied and is no
restriction.

It should be noted that there is always a trivial solution for which $E^2$ is
formally equal to unity. The trivial solution is to set $x_{jp} = 1$. Then
$t_i = u_i = 2$; $R = W = 4C$; $E^2 = 1$; and (6.4) is satisfied since it
reduces to (6.7). For this trivial solution, E is of course not an actual
correlation ratio.

The non-trivial numerical solution of (6.4) can be carried out in practice with
the aid of matrix algebra. Instead of regarding the $x_{jp}$ as elements of a
table with n rows with $m_j$ elements in the jth row, consider the rows of such
a table placed end to end to form a single row of $M = \sum_j m_j$ elements.
Denote this as the row vector x. Correspondingly, consider the values
$h_{jk/pr}$ arranged to form the elements of a symmetric matrix H of M rows and
columns; consider the M values $A_{jp}$ to be the diagonal elements of an
$M \times M$ diagonal matrix A; and consider the values of $c_{jk/pr}$ arranged
to form an $M \times M$ symmetric matrix C. Let $\lambda = \frac{1}{2} E^2$.
Then (6.4) becomes in matric form

(6.8)    $xH = \lambda (xA + xC) = \lambda x (A + C).$

In the next paragraph it is shown that, in general, $(A + C)$ is non-singular,
so that it has an inverse by which the members of (6.8) can be postmultiplied,
yielding

(6.9)    $xH(A + C)^{-1} = \lambda x.$

This shows that x is a latent vector of $H(A + C)^{-1}$, and $\lambda$ is the
latent root to which this vector corresponds. Since we want the largest
possible correlation ratio, we seek the largest of the non-trivial latent
roots. If the two largest non-trivial roots are not equal, which should
ordinarily be the case in practice, then there will be a unique latent vector
associated with the largest root.

It is of interest to show that all the latent roots of $H(A + C)^{-1}$ are real
and non-negative, and that all the latent vectors are real. First, we notice
that H is Gramian, for its elements are product sums. To see that $A + C$ is
Gramian, we notice that from (5.18) and (5.10),

(6.10)   $W = 2 \sum_j \sum_p x_{jp}^2 A_{jp} + 2 \sum_j \sum_k \sum_p \sum_r x_{jp} x_{kr} c_{jk/pr} - V^2 C,$

or, in matric notation, and transposing members,

(6.11)   $2x(A + C)x' = W + V^2 C.$

Since W is a sum of squares, the right member is clearly non-negative; and hence

(6.12)   $x(A + C)x' \geq 0,$

for all x. Thus, $A + C$ is nonnegative-definite, or Gramian. Furthermore,
$A + C$ is in general nonsingular, because according to (5.17) and (5.18), V
and W cannot vanish simultaneously unless

(6.13)   $(x_{jp} + x_{kr}) c_{jk/pr} = 0.$

If $n \geq 3$, then (6.13) will ordinarily imply that $x_{jp} = 0$, that is, the
equality in (6.12) will hold if and only if $x = 0$. In such a case, $A + C$ is
positive-definite, or is nonsingular as well as Gramian, and possesses an
inverse.

As is well known, the inverse of a Gramian matrix is Gramian (see [5, p. 71],
for example), so that $(A + C)^{-1}$ is Gramian. That the latent roots of
$H(A + C)^{-1}$ are all nonnegative follows from a general theorem that the
latent roots of the product of two Gramian matrices are always nonnegative
[5, p. 116]. The proof of this is brief, and will be repeated here in a
slightly different variation in order to prove in addition that the latent
vectors are all real. Let G be a symmetric square root of $A + C$, so that
$G^2 = A + C$. If we postmultiply both members of (6.9) by G, we can write the
results as:

(6.14)   $(xG)(G^{-1} H G^{-1}) = \lambda (xG).$

This shows that xG is a latent vector of $G^{-1} H G^{-1}$ corresponding to the
root $\lambda$. But $G^{-1} H G^{-1}$ is symmetric, and in fact Gramian, for it
can be written in the form $(G^{-1} K)(G^{-1} K)'$, where $KK' = H$. Hence, each
$\lambda$ is nonnegative, and each xG is real, whence each x is real.

The numerical solution of (6.9) can be carried out by the simple iterative 
technique for latent roots and vectors (see, for example, [6]). The iterations 
converge in general to the vector associated with the largest root. To avoid 
convergence to the trivial solution (which formally has the largest root), the 
trial vectors should be adjusted to satisfy (5.23); then they will in general
converge to the vector associated with the largest non-trivial root. 
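
A rough sketch of such an iteration for (6.9) is given below; it is not claimed
to be the procedure of [6]. It assumes that H and B = A + C are supplied as
symmetric lists of lists with B nonsingular, and that `weights` holds the
diagonal elements $A_{jp}$, so that subtracting a constant to make the weighted
sum of the x-values vanish imposes (5.23). All names and the iteration count
are hypothetical.

```python
import math


def solve(B, y):
    """Solve B w = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [list(B[r]) + [y[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w


def deflate(x, weights):
    """Subtract a constant so that sum(weights * x) = 0, i.e. V = 0."""
    shift = sum(a * v for a, v in zip(weights, x)) / sum(weights)
    return [v - shift for v in x]


def row_times(x, M):
    return [sum(x[j] * M[j][k] for j in range(len(x))) for k in range(len(x))]


def largest_nontrivial_latent_root(H, B, weights, trial, iterations=300):
    """Iterate x <- x H B^{-1}, adjusting each trial vector to V = 0."""
    x = deflate(trial, weights)
    for _ in range(iterations):
        x = solve(B, row_times(x, H))      # valid because B is symmetric
        x = deflate(x, weights)
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
    xH, xB = row_times(x, H), row_times(x, B)
    lam = sum(a * b for a, b in zip(xH, x)) / sum(a * b for a, b in zip(xB, x))
    return lam, x
```

The returned `lam` estimates $\lambda$ in (6.9), so the corresponding $E^2$ is
twice that value. Solving a linear system at each step avoids forming
$(A + C)^{-1}$ explicitly, which is a design choice of this sketch rather than
something the text prescribes.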

A marginal indication of the internal consistency of the judgments is the
agreement in sign of

         $(x_{jp} + x_{kr}) - (x_{jq} + x_{ks})$

with

         $\sum_i (e_{ijk/pr,qs} - e_{ijk/qs,pr})$

for each of the comparisons. If one combination is judged higher by more
people in comparison with another, then its x-values should exceed those of the
other for marginal consistency.


7. The maximum under certain linear restrictions. In the previous section,
no restrictions were placed on the $x_{jp}$ in maximizing $E^2$. For some
problems, the $O_{jp}$ may be quantitative, and it may be desired within each
item to keep the distances between the $x_{jp}$ proportionate to the distances
between the $O_{jp}$. This was the case for the score card, where a linear
system of weighting had to be used to be practicable for the Army. It was
necessary to derive a constant multiplier for length of service, a constant
multiplier for time overseas, etc., even though there were curvilinearities in
the judgments.

Our principle enables us to handle such restrictions just as well as the
unrestricted case. We shall derive the set of multipliers which is most
consistent for the judgments in the sense of least squares. The ordering of
categories within an item will no longer be considered arbitrary. Instead,
subscripts will be assigned in a fashion to make $(O_{jp} - O_{jq})$
proportional to $(p - q)$ within each item. For convenience, the subscripts can
be assigned beginning from zero.

The linear restriction is to determine x-values in the form

(7.1)    $x_{jp} = \xi_j + p \eta_j,$

where the $\xi_j$ and the $\eta_j$ are now the basic unknowns to be solved for to
maximize $E^2$. It is the $\eta_j$ that are of interest, for they will be the
multipliers; but the $\xi_j$ have to be used in the analysis to help determine
the multipliers even though they are only additive constants that will not
affect the order of total scores of people.
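
The remark about additive constants can be illustrated with a short sketch. It
assumes that a person's total score is the sum, over items, of the x-value of
the category that person falls in; the function names and the numbers are
hypothetical.

```python
def linear_values(xi, eta, n_categories):
    """x[j][p] = xi[j] + p * eta[j], with p counted from zero, as in (7.1)."""
    return [[xi[j] + p * eta[j] for p in range(n_categories[j])]
            for j in range(len(xi))]


def total_score(x, categories):
    """Sum of x-values over items; categories[j] is the person's category on item j."""
    return sum(x[j][p] for j, p in enumerate(categories))


eta = [1.0, 2.0]                                   # hypothetical multipliers
x_a = linear_values([0.0, 0.0], eta, [3, 4])       # additive constants all zero
x_b = linear_values([5.0, -2.0], eta, [3, 4])      # different additive constants
gap_a = total_score(x_a, [2, 1]) - total_score(x_a, [0, 3])
gap_b = total_score(x_b, [2, 1]) - total_score(x_b, [0, 3])
```

Changing the additive constants shifts every person's total by the same
amount, so `gap_a` and `gap_b` agree and the order of total scores depends
only on the multipliers.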

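To maximize $E^2$ under the linear restrictions, we differentiate the right
member of (5.22) with respect to the $\xi_j$ and the $\eta_j$, set the
derivatives equal to zero, and obtain the stationary equations

(7.2)    $\frac{\partial R}{\partial \xi_j} = E^2 \frac{\partial W}{\partial \xi_j},$

(7.3)    $\frac{\partial R}{\partial \eta_j} = E^2 \frac{\partial W}{\partial \eta_j}.$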

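In order to evaluate the indicated derivatives, it is helpful to introduce some
more notation. Let:

(7.4)    $l_{0,ik} \equiv \sum_r f_{ikr}, \qquad m_{0,ik} \equiv \sum_r g_{ikr},$

(7.5)    $l_{1,ik} \equiv \sum_r r f_{ikr}, \qquad m_{1,ik} \equiv \sum_r r g_{ikr},$

(7.6)    $d_{0,jk} \equiv \sum_p \sum_r c_{jk/pr}, \qquad d_{1,jk} \equiv \sum_p \sum_r p \, c_{jk/pr},$

(7.7)    $d_{2,jk} \equiv \sum_p \sum_r p r \, c_{jk/pr} = d_{2,kj},$

(7.8)    $D_{a,j} \equiv \sum_k \sum_p \sum_r p^a c_{jk/pr} = \sum_p p^a A_{jp}, \qquad (a = 0, 1, 2),$

(7.9)    $h_{0,jk} \equiv \frac{1}{F} \sum_i (l_{0,ij} l_{0,ik} + m_{0,ij} m_{0,ik}),$

(7.10)   $h_{1,jk} \equiv \frac{1}{F} \sum_i (l_{1,ij} l_{0,ik} + m_{1,ij} m_{0,ik}),$

(7.11)   $h_{2,jk} \equiv \frac{1}{F} \sum_i (l_{1,ij} l_{1,ik} + m_{1,ij} m_{1,ik}).$

It is important to notice that $d_{0,jk} = d_{0,kj}$, but that
$d_{1,jk} \neq d_{1,kj}$. Similarly, $h_{0,jk} = h_{0,kj}$ and
$h_{2,jk} = h_{2,kj}$, but $h_{1,jk} \neq h_{1,kj}$.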

To evaluate the derivatives of R, it is helpful to re-write the right members of
(5.13) and (5.14) by means of (7.1), (7.4), and (7.5):

(7.12)   $t_i = \frac{2}{F} \sum_k (\xi_k l_{0,ik} + \eta_k l_{1,ik}),$

(7.13)   $u_i = \frac{2}{F} \sum_k (\xi_k m_{0,ik} + \eta_k m_{1,ik}).$

Differentiating the right member of (5.24) with respect to the $\xi_j$ and the
$\eta_j$ respectively with the aid of (7.12) and (7.13), and using (7.9),
(7.10), and (7.11), yields

(7.14)   $\frac{\partial R}{\partial \xi_j} = 8 \sum_k (\xi_k h_{0,jk} + \eta_k h_{1,kj}),$

(7.15)   $\frac{\partial R}{\partial \eta_j} = 8 \sum_k (\xi_k h_{1,jk} + \eta_k h_{2,jk}).$

For the derivatives of W, we re-write (5.25) using (7.1):

(7.16)   $W = \sum_j \sum_k \sum_p \sum_r (\xi_j + p \eta_j + \xi_k + r \eta_k)^2 c_{jk/pr}.$

Differentiating with respect to the $\xi_j$ and $\eta_j$ respectively, we
obtain, using (7.6), (7.7), and (7.8),

(7.17)   $\frac{\partial W}{\partial \xi_j} = 4 [\xi_j D_{0,j} + \eta_j D_{1,j} + \sum_k (\xi_k d_{0,jk} + \eta_k d_{1,kj})],$

(7.18)   $\frac{\partial W}{\partial \eta_j} = 4 [\xi_j D_{1,j} + \eta_j D_{2,j} + \sum_k (\xi_k d_{1,jk} + \eta_k d_{2,jk})].$

The stationary equations (7.2) and (7.3) can now be re-written, by means of
(7.14), (7.15), (7.17), and (7.18) as:

(7.19)   $\sum_k (\xi_k h_{0,jk} + \eta_k h_{1,kj}) = \frac{1}{2} E^2 [\xi_j D_{0,j} + \eta_j D_{1,j} + \sum_k (\xi_k d_{0,jk} + \eta_k d_{1,kj})],$

(7.20)   $\sum_k (\xi_k h_{1,jk} + \eta_k h_{2,jk}) = \frac{1}{2} E^2 [\xi_j D_{1,j} + \eta_j D_{2,j} + \sum_k (\xi_k d_{1,jk} + \eta_k d_{2,jk})].$

These are the equations to be solved numerically for the maximizing $\xi_j$ and
$\eta_j$.

Before showing a procedure for the numerical solution, let us verify that a
solution of (7.19) and (7.20) will satisfy (5.23). From (7.1), (5.17), and (7.8),

(7.21)   $V = \frac{2}{C} \sum_k (\xi_k D_{0,k} + \eta_k D_{1,k}).$

Summing both members of (7.19) over j shows that

         $(1 - E^2) \sum_k (\xi_k D_{0,k} + \eta_k D_{1,k}) = 0,$

or, from (7.21),

         $(1 - E^2) V = 0.$

Hence, if $E^2 \neq 1$, the corresponding solution will satisfy the condition
that $V = 0$.

As in the unrestricted case, there is always a trivial solution that will yield
an $E^2$ formally equal to unity. This trivial solution is $\xi_j = 1$,
$\eta_j = 0$, which makes $x_{jp} = 1$ as in the previous case. These values
satisfy (7.19) and (7.20), and have $E^2 = 1$. Of course, E is again not an
actual correlation ratio for this trivial solution.

To obtain a non-trivial solution, it is convenient to write (7.19) and (7.20) in
matric notation. Let

(7.22)   $z = \| [\xi_j] \;\; [\eta_j] \|.$

z is a row vector of 2n elements, the first n elements being the $\xi_j$ and the
last n elements being the $\eta_j$. Let

(7.23)   $h = \begin{Vmatrix} [h_{0,jk}] & [h_{1,kj}] \\ [h_{1,jk}] & [h_{2,jk}] \end{Vmatrix}.$

h is $2n \times 2n$ and is symmetric; in fact it is also Gramian, since its
elements are product sums. Let $\delta_{jk}$ be Kronecker's delta, and let

(7.24)   $c = \begin{Vmatrix} [D_{0,j} \delta_{jk} + d_{0,jk}] & [D_{1,j} \delta_{jk} + d_{1,kj}] \\ [D_{1,j} \delta_{jk} + d_{1,jk}] & [D_{2,j} \delta_{jk} + d_{2,jk}] \end{Vmatrix}.$

c also is $2n \times 2n$, symmetric, and Gramian. Again let

(7.25)   $\lambda = \frac{1}{2} E^2.$


Equations (7.19) and (7.20) can now be stated as a single matric equation:

(7.26)   $zh = \lambda zc.$

In general, c will be nonsingular, so that it will have an inverse by which both
members of (7.26) can be postmultiplied to yield

(7.27)   $zhc^{-1} = \lambda z.$

Therefore z is a latent vector of $hc^{-1}$, and $\lambda$ is a latent root. Since
we want the largest correlation ratio, we seek the largest of the non-trivial
latent roots. The largest root in practice will ordinarily be unique. There is
then a unique latent vector corresponding to this root, and the elements of this
vector provide the most consistent $\xi_j$ and $\eta_j$ for the population in
the sense of least squares.

That c is Gramian and in general nonsingular, that the latent roots of
$hc^{-1}$ are all nonnegative, and that the latent vectors of $hc^{-1}$ are all
real, requires only proofs analogous to those for the corresponding properties
of $A + C$ and $H(A + C)^{-1}$ in the previous section, which need not be
repeated here.

As in the previous section, the final numerical steps can be carried out by 
iterations according to (7.27). Again, the trial vectors should be adjusted to 
conform to (5.23) to prevent convergence to the trivial solution. 

A marginal indication of the consistency of the quantification is the agreement
in sign of

         $(p - q) \eta_j + (r - s) \eta_k$

with

         $\sum_i (e_{ijk/pr,qs} - e_{ijk/qs,pr})$

for all comparisons.


Appendix: A distinction between the conventional principle and the present
principle. The relationship between the conventional principle of estimating
means of hypothetical distributions and the present principle of reproducing
the comparisons of each individual will be analyzed here for the case of
ordinary comparisons. Only the principles will be contrasted here.

In the conventional approach, it is assumed that each of the N individuals
has a numerical value for each of the $O_j$. Let $s_{ij}$ be such a value of
$O_j$ for the ith individual. The hypothesis is that person i makes the
judgment $O_j > O_k$ if $s_{ij} > s_{ik}$; and the conventional problem is to
estimate from the judgments what the relative distances are between the means
$\mu_j$, where

(A.1)    $\mu_j \equiv \frac{1}{N} \sum_i s_{ij}.$

The ranges of the subscripts are: $i = 1, 2, \ldots, N$; $j, k, l = 1, 2, \ldots, n$;
and will not be explicitly indicated.

According to the approach of this paper, if we are to consider hypothetical
variables, the problem would be to determine for each $O_j$ a numerical value
$x_j$ such that the differences $(x_j - x_k)$ will best approximate the
$(s_{ij} - s_{ik})$ for each individual in the sense of least squares. This will
separate "higher than" x-values from "lower than" x-values. If we let

(A.2)    $Z \equiv \sum_i \sum_j \sum_k [(s_{ij} - s_{ik}) - w_i (x_j - x_k)]^2,$

where $w_i$ is a constant of proportionality to be determined for each
individual separately, then the problem is to determine the $x_j$ and the $w_i$
which will minimize Z.

Differentiating Z with respect to the $x_j$ and the $w_i$ respectively, and
setting the derivatives equal to zero, yields the stationary equations

(A.3)    $\sum_i [(s_{ij} - \bar{s}_i) - w_i (x_j - \bar{x})] w_i = 0,$

(A.4)    $\sum_k (x_k - \bar{x})(s_{ik} - w_i x_k) = 0,$

where

(A.5)    $\bar{s}_i \equiv \frac{1}{n} \sum_k s_{ik}, \qquad \bar{x} \equiv \frac{1}{n} \sum_k x_k.$

Since Z is invariant with respect to translations of the $x_j$ (also to
translations of the $s_{ij}$), the origin of the $x_j$ is arbitrary, and there is
no loss in generality in setting

(A.6)    $\bar{x} = 0.$

Then if we let

(A.7)    $\alpha \equiv \sum_i w_i^2, \qquad \beta \equiv \sum_k x_k^2,$

equations (A.3) and (A.4) can be re-written respectively as

(A.8)    $\sum_i (s_{ij} - \bar{s}_i) w_i = \alpha x_j,$

(A.9)    $\sum_k s_{ik} x_k = \beta w_i.$

By summing both members of (A.8) over j, we see that

(A.10)   $\alpha \sum_j x_j = 0.$

Therefore, since in general $\alpha > 0$, we must have $\bar{x} = 0$; and a
solution of (A.8) will necessarily be consistent with (A.6).

Using (A.9) in (A.8) yields the stationary equations for the $x_j$ alone:

(A.11)   $\sum_k x_k \sum_i s_{ik} (s_{ij} - \bar{s}_i) = \alpha \beta x_j.$

This shows that the $x_j$ are elements of a latent vector corresponding to a
latent root $\alpha \beta$ of the $n \times n$ matrix defined by the elements
$S_{jk}$, where

(A.12)   $S_{jk} \equiv \sum_i s_{ik} (s_{ij} - \bar{s}_i) = \sum_i s_{ij} s_{ik} - \frac{1}{n} \sum_i \sum_l s_{ik} s_{il}.$

To determine which one of the latent roots provides the minimum Z, we first
notice (by multiplying both members of (A.9) by $w_i$, summing over i, and using
(A.7)) that

(A.13)   $\sum_i \sum_k s_{ik} x_k w_i = \alpha \beta.$

Then expanding the right member of (A.2) with the aid of (A.9) and (A.13), we
obtain

(A.14)   $Z/2n = \sum_i \sum_j (s_{ij} - \bar{s}_i)^2 - \alpha \beta.$

Clearly, Z will be minimized if we use the largest $\alpha \beta$. Therefore, we
seek the latent vector associated with the largest latent root of $\|S_{jk}\|$.

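To examine the relation of the elements of this minimizing latent vector to the
means $\mu_j$ of the hypothetical variables, denote the variances and
correlations of the hypothetical variables by:

(A.15)   $\sigma_j^2 \equiv \frac{1}{N} \sum_i (s_{ij} - \mu_j)^2 = \frac{1}{N} \sum_i s_{ij}^2 - \mu_j^2,$

(A.16)   $\rho_{jk} \equiv \frac{\sum_i (s_{ij} - \mu_j)(s_{ik} - \mu_k)}{N \sigma_j \sigma_k} = \frac{\frac{1}{N} \sum_i s_{ij} s_{ik} - \mu_j \mu_k}{\sigma_j \sigma_k}.$

Then

(A.17)   $\sum_i s_{ij} s_{ik} = N (\sigma_j \sigma_k \rho_{jk} + \mu_j \mu_k).$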





From (A.17) and the last member of (A.12), we can write

(A.18)   $\frac{1}{N} S_{jk} = \sigma_j \sigma_k \rho_{jk} + \mu_j \mu_k - \frac{1}{n} \sum_l (\sigma_k \sigma_l \rho_{kl} + \mu_k \mu_l).$

The elements of the matrix of which the $x_j$ are a latent vector are now
expressed in terms of the means, variances, and correlations of the hypothetical
variables, according to the right member of (A.18). It is clear that in general,
the $\mu_j$ are not elements of a latent vector of $\|S_{jk}\|$, so that our
approach is in general not equivalent to the conventional approach.

In the special case of equal variances and correlations, such as is often assumed in the conventional approach,⁴ we can now see that the $\mu_j$ do define a latent vector. For this case, let the common variance be $\sigma^2$, and let the common correlation coefficient be $\rho$. Then

(A.19)    $\rho_{jk} = \rho + \delta_{jk}(1 - \rho),$

where $\delta_{jk}$ is Kronecker's delta, and (A.18) becomes

(A.20)    $\frac{1}{N} S_{jk} = \sigma^2 (1 - \rho)\left(\delta_{jk} - \frac{1}{n}\right) + (\mu_j - \bar{\mu})\mu_k,$

where

(A.21)    $\bar{\mu} = \frac{1}{n} \sum_j \mu_j.$

From (A.20) and (A.12), (A.11) becomes converted to

(A.22)    $[\gamma - \sigma^2(1 - \rho)]\, x_j = (\mu_j - \bar{\mu}) \sum_k \mu_k x_k,$

where

(A.23)    $\gamma = \alpha\beta / N.$

Multiplying both members of (A.22) by $x_j$ and summing over j shows that

(A.24)    $\left(\sum_j \mu_j x_j\right)^2 = \beta[\gamma - \sigma^2(1 - \rho)].$

From (A.22) and (A.24) we obtain the elements of the minimizing latent vector for Z to be, in normalized form,

(A.25)    $\frac{x_j}{\sqrt{\beta}} = \frac{\mu_j - \bar{\mu}}{\sqrt{\gamma - \sigma^2(1 - \rho)}}.$

That this is the minimizing vector follows from the fact that the remaining latent roots must all have $\gamma = \sigma^2(1 - \rho)$ in order to have vectors distinct from (A.25); (A.25) does correspond to the largest nontrivial root, since for it the root satisfies the inequality $\gamma > \sigma^2(1 - \rho)$. (The remaining latent vectors are not uniquely defined, for they all correspond to equal roots.) Therefore, the means of the hypothetical variables are a linear function of the elements of the minimizing latent vector for the case of equal variances and correlations.

⁴ More specifically, zero correlations are assumed, but this is not necessary for our purpose.
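The equal-variance case can be checked numerically. The sketch below is only an illustration: the means, the common variance and the common correlation are arbitrary values chosen here, not taken from the paper. It builds the matrix on the right of (A.20) and confirms that the centred means $\mu_j - \bar{\mu}$ form a latent vector, with root $\gamma = \sigma^2(1 - \rho) + \sum_j (\mu_j - \bar{\mu})^2$, which is what (A.22) gives when $x_j = \mu_j - \bar{\mu}$ is substituted.

```python
# Illustrative values only; none of these numbers come from the paper.
mu = [1.0, 2.5, 4.0, 7.5]      # hypothetical means of the n objects
sigma2, rho = 2.0, 0.3         # assumed common variance and correlation
n = len(mu)
mu_bar = sum(mu) / n

# Matrix of (A.20): sigma^2 (1 - rho)(delta_jk - 1/n) + (mu_j - mu_bar) mu_k
M = [[sigma2 * (1 - rho) * ((1.0 if j == k else 0.0) - 1.0 / n)
      + (mu[j] - mu_bar) * mu[k]
      for k in range(n)] for j in range(n)]

x = [m - mu_bar for m in mu]                        # candidate latent vector
gamma = sigma2 * (1 - rho) + sum(v * v for v in x)  # candidate latent root

# Largest absolute difference between M x and gamma x.
Mx = [sum(M[j][k] * x[k] for k in range(n)) for j in range(n)]
residual = max(abs(Mx[j] - gamma * x[j]) for j in range(n))
print(residual)
```

The residual is zero up to rounding, and the root exceeds $\sigma^2(1 - \rho)$ whenever the means are not all equal, in line with the inequality stated above.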
As a final comment, it should be pointed out that paired comparisons are insufficient to estimate the hypothetical values. Two persons with widely different hypothetical values will make the same judgments provided only that their values have the same rank order. Therefore, hypotheses about variables presumed to underlie the comparisons cannot be completely tested only on the basis of the comparisons.

Psychologically, it may or may not be proper to assume that judgments of the type $O_j > O_k$ can be expressed as a function of differences $s_{ij} - s_{ik}$. Perhaps, psychologically, comparisons may operate on some more complicated principle.
The approach presented in the body of this paper does not assume anything 
about underlying variables, but simply seeks a set of numerical values that will 
best help reproduce the observed data for each individual. 
RELATIVE ACCURACY OF SYSTEMATIC AND STRATIFIED RANDOM 
SAMPLES FOR A CERTAIN CLASS OF POPULATIONS¹ 

By W. G. Cochran 
Iowa State College 

1. Summary. A type of population frequently encountered in extensive samplings is one in which the variance within a group of elements increases steadily as the size of the group increases. This class of populations may be represented by a model in which the elements are serially correlated, the correlation between two elements being a positive and monotone decreasing function of the distance apart of the elements. For populations of this type, the relative efficiencies are compared for a systematic sample of every kth element, a stratified random sample with one element per stratum and a random sample.

The stratified random sample is always at least as accurate on the average as the random sample and its relative efficiency is a monotone increasing function of the size of the sample. No general result is valid for the relative efficiency of the systematic sample. In fact, there are populations in the class in which the systematic sample is more accurate than the stratified sample for one sampling rate, but is less accurate than the random sample for another sampling rate. If, however, the correlogram is in addition concave upwards, the systematic sample is on the average more accurate than the stratified sample for any size of sample.

Some numerical results are given for the cases in which the correlogram is (i) linear, (ii) exponential.

2. Introduction. We consider a finite population consisting of the elements $x_1, x_2, \ldots, x_{nk}$, where n and k are integers. A systematic sample is drawn by choosing an element at random from the elements $x_1, \ldots, x_k$, and then selecting every kth consecutive element. That is, if $x_i$ is the element first chosen, the systematic sample comprises the elements $x_i, x_{i+k}, \ldots, x_{i+(n-1)k}$. This type of sample has found considerable use in practice, because it is often easier to select and to administer than a random or stratified random sample and because it has an intuitive appeal through spreading the sample evenly over the population. Much remains to be learned, however, about the accuracy of this systematic sample relative to that of comparable random or restricted random samples. Probably the most relevant comparison is that between the systematic sample and the stratified random sample having one element per stratum. In the latter case, the population is divided into the n strata $\{x_1, \ldots, x_k\}$, $\{x_{k+1}, \ldots, x_{2k}\}$, $\ldots$, and one element is chosen independently at random from each of the strata. This type of sample is similar in many respects to the systematic sample. Both divide the population into the same n strata of k elements each, with one element chosen from each stratum. Moreover, neither sample provides the data for an unbiased estimate of the sampling variance of the sample mean, at least in the sense that the estimate is unbiased whatever the form of the population of elements $x_i$.

¹ Journal paper No. J-1341 of the Iowa Agricultural Experiment Station, Ames, Iowa. Project 891.
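The two selection rules can be written out in a few lines. The sketch below is an illustration only; the function names, the seed and the values of n and k are choices made here, and positions are numbered 1 to nk as in the text.

```python
import random

def systematic_sample(n, k, rng):
    """Random start among the first k positions, then every kth position."""
    start = rng.randint(1, k)
    return [start + j * k for j in range(n)]

def stratified_sample(n, k, rng):
    """One position drawn independently from each stratum {jk+1, ..., (j+1)k}."""
    return [j * k + rng.randint(1, k) for j in range(n)]

rng = random.Random(0)      # fixed seed so that the sketch is repeatable
n, k = 5, 4                 # a population of nk = 20 elements
sy = systematic_sample(n, k, rng)
st = stratified_sample(n, k, rng)
print(sy)
print(st)
```

Both samples take exactly one element from each of the same n strata; they differ only in that the systematic sample uses the same position within every stratum.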

The first thorough investigation of the properties of systematic samples was made by W. G. and L. H. Madow [1]. In particular, these authors compared the accuracies of a systematic sample and a stratified random sample of the types described above for several types of finite population. Where the elements in the population lie on the line $x_i = i$, they showed that the stratified random sample, with one element per stratum, is more accurate than the systematic sample. If the population has a periodic distribution, the stratified random sample is superior when k is an integral multiple of the period, but the systematic sample is superior when k is an odd multiple of the half-period. The authors also considered the more complex case where the population contains both a trend function and a periodic function.

The object of this paper is to make similar comparisons for another type of population which appears to be fairly frequently encountered in extensive samplings. The population is one in which the variance among the elements in any group of contiguous elements increases steadily as the size of the group increases. This type of population has long been regarded as applicable in field experimental work, where the variance among plots within a block is found usually to increase with the size of block. Summarizing data from 40 uniformity trials, Fairfield Smith [2] verified this notion and derived an empirical relationship from which the rate of increase may be estimated. The same type of population is also considered in several recent papers on extensive sample surveys. Thus, in a discussion of methods for sampling farm populations, Jessen [3] postulated a law in which the variance among farms within a grid is a monotone increasing function of the size of the grid and used the law for estimating the optimum number of farms which should be included in a sampling-unit. Mahalanobis [4] independently developed the same law as Fairfield Smith in a comprehensive investigation of large-scale sample surveys. Hansen and Hurwitz [5] referred to the increase in variance within a cluster with growing size of cluster as typical of many actual populations. Numerous other references could be given.

3. Specification of the population. Various mathematical models may be constructed to represent the situation in which the variance within any group increases with increasing size of group. For instance, we might consider that the elements $x_i$ are drawn from different populations, the population changing in some regular manner with i. Alternatively, the $x_i$ may be assumed to belong to the same population, but to be serially correlated. For simplicity, we assume further that the serial correlation between $x_i$ and $x_{i+u}$ is some quantity $\rho_u$ which depends only on u. Then if $\rho_u$ is positive and is a monotone decreasing function of u, it may be expected from intuition (and will be proved later) that the variance within the group of elements $x_i, x_{i+1}, \ldots, x_{i+k}$ is a monotone increasing function of k. This model seems appropriate for our purpose, since many writers refer explicitly to positive correlations between the x's as the basis for the phenomenon of increasing variance.

The specification above will be qualified in one respect. To assume that the $\rho$'s are strictly monotone for an actual finite population of only moderate size does not seem realistic. While the correlogram may exhibit a definite downward trend, yet individual fluctuations about the trend prevent the correlogram from being strictly monotone. It is more reasonable to regard the finite population as being itself a sample from an infinite population in which the $\rho$'s are monotone. This attitude is, I believe, in accord with that of the authors referred to above, who, as I interpret their writings, regard the variance law as holding in an idealized population. Thus, comparisons between the systematic and stratified random samples will be made not for a single finite population, but for the average of finite populations drawn from an infinite population with monotone decreasing $\rho$. Results for an individual finite population will differ from the average results because the r's which appear in the population fluctuate about their expectations $\rho$. As the finite population becomes larger, its results will tend to coincide with the average results.

Accordingly, the elements $x_i$, $i = 1, 2, \ldots, nk$, are assumed to be drawn from a population in which

$E(x_i) = \mu, \quad E(x_i - \mu)^2 = \sigma^2, \quad E(x_i - \mu)(x_{i+u} - \mu) = \rho_u \sigma^2,$

where $\rho_u \ge \rho_v \ge 0$, whenever $u < v$.


4. Some useful preliminary formulas. If $\bar{x}$ is the mean of a specified finite population, the following algebraic identity, frequently useful in the analysis of variance, is easily established.

(1)    $(kn) \sum_{i=1}^{kn} (x_i - \bar{x})^2 = \sum_{i=1}^{kn} \sum_{j>i} (x_i - x_j)^2.$

Since there are $(kn)(kn - 1)/2$ possible pairs of values $(x_i, x_j)$, this gives

(2)    $\frac{2}{(kn - 1)} \sum_{i=1}^{kn} (x_i - \bar{x})^2 = E(x_i - x_j)^2 = E[(x_i - \mu) - (x_j - \mu)]^2,$

where E is taken over the finite population. Now expand the quadratic and average over all finite populations. In the $(kn)(kn - 1)/2$ combinations, there are $(kn - 1)$ in which j exceeds i by 1, $(kn - 2)$ in which j exceeds i by 2, and so on. Hence

(3)    $E \sum_{i=1}^{kn} (x_i - \bar{x})^2 = (kn - 1)\, \sigma^2 \left\{ 1 - \frac{2}{(kn)(kn - 1)} \sum_{u=1}^{kn-1} (kn - u)\rho_u \right\}.$



To obtain the corresponding expectation for the sum of squares within a single stratum of k consecutive elements, we need only replace (kn) by k in (3). Since the result is the same for all n strata, we obtain

(4)    $E(\text{S.S. within strata}) = n(k - 1)\, \sigma^2 \left\{ 1 - \frac{2}{k(k - 1)} \sum_{u=1}^{k-1} (k - u)\rho_u \right\}.$

Formula (3) also gives the expected sum of squares within a specified systematic sample if we replace (kn) by n and $\rho_u$ by $\rho_{ku}$, since there are n elements in the sample and since the correlations between successive elements are $\rho_k, \rho_{2k}, \ldots$ instead of $\rho_1, \rho_2, \ldots$. The result is the same for each of the k systematic samples. Hence

(5)    $E(\text{S.S. within systematic samples}) = k(n - 1)\, \sigma^2 \left\{ 1 - \frac{2}{n(n - 1)} \sum_{u=1}^{n-1} (n - u)\rho_{ku} \right\}.$


5. Average variance for a random sample. The symbols $\sigma_r^2$, $\sigma_{st}^2$, $\sigma_{sy}^2$ will be used to denote the average variances of the means of the random, stratified random and systematic samples, respectively, about the mean of the finite population, this average being taken over all finite populations drawn from the infinite population specified in section 3. Comparisons with the random sample, though not our main purpose, will be included where they are of interest.

For a single finite population, it has been shown by several writers that the variance of the mean of a random sample is

(6)    $\frac{1}{n} \cdot \frac{(kn - n)}{(kn - 1)} \cdot \frac{1}{kn} \sum_{i=1}^{kn} (x_i - \bar{x})^2,$

where $\bar{x}$ is the mean of the finite population. From (3), we obtain

(7)    $\sigma_r^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ 1 - \frac{2}{(kn)(kn - 1)} \sum_{u=1}^{kn-1} (kn - u)\rho_u \right\}.$

6. Average variance for a stratified random sample. If $\bar{x}_{st}$ is the mean of a typical stratified random sample, the sampling variance of $\bar{x}_{st}$ is by definition

(8)    $E(\bar{x}_{st} - \bar{x})^2.$

Consider first the average over a single finite population. Let $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$ be the means of the n strata, respectively, and let $x_{1j}, x_{2j}, \ldots, x_{nj}$ be the elements selected from the respective strata. Then (8) may be written

(9)    $\frac{1}{n^2} E[(x_{1j} - \bar{x}_1) + (x_{2j} - \bar{x}_2) + \cdots + (x_{nj} - \bar{x}_n)]^2,$

since $\sum_{i=1}^{n} x_{ij} = n\bar{x}_{st}$ and $\sum_{i=1}^{n} \bar{x}_i = n\bar{x}$.

Take the average over all $k^n$ samples from the finite population. All cross-product terms vanish, since, for example, $x_{1j}$ appears equally often with $x_{21}, x_{22}, \ldots, x_{2k}$. This gives

(10)    $\frac{1}{n^2 k} \sum_{i=1}^{n} \sum_{j=1}^{k} (x_{ij} - \bar{x}_i)^2$

for the variance for a single finite population. The sum of squares involved is, of course, simply the sum of squares within strata. Hence, by (4),

(11)    $\sigma_{st}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ 1 - \frac{2}{k(k - 1)} \sum_{u=1}^{k-1} (k - u)\rho_u \right\}.$



7. Average variance for the systematic sample. If $\bar{x}_{sy}$ is the mean of a typical sample, the variance for a single finite population is

(12)    $E(\bar{x}_{sy} - \bar{x})^2 = \frac{1}{kn} \left\{ n \sum (\bar{x}_{sy} - \bar{x})^2 \right\},$

where the sum is taken over the k systematic samples. Since the sum of squares among samples is equal to the total sum of squares in the population minus the sum of squares within samples, (12) equals

(13)    $\frac{1}{kn} \sum_{i=1}^{kn} (x_i - \bar{x})^2 - \frac{1}{kn} (\text{S.S. within systematic samples}).$

To obtain the average over all finite populations we substitute from (3) and (5) for the first and second terms respectively. The result is

(14)    $\sigma_{sy}^2 = \frac{(kn - 1)}{kn}\, \sigma^2 \left\{ 1 - \frac{2}{(kn)(kn - 1)} \sum_{u=1}^{kn-1} (kn - u)\rho_u \right\} - \frac{(n - 1)}{n}\, \sigma^2 \left\{ 1 - \frac{2}{n(n - 1)} \sum_{u=1}^{n-1} (n - u)\rho_{ku} \right\}.$

This reduces to

(15)    $\sigma_{sy}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ 1 - \frac{2}{kn(k - 1)} \sum_{u=1}^{kn-1} (kn - u)\rho_u + \frac{2k}{n(k - 1)} \sum_{u=1}^{n-1} (n - u)\rho_{ku} \right\}.$

It should be noted that the formulas and notations above are different from those used by the Madows, who define $\rho$ and $\sigma^2$ with reference to a single finite population and discuss the sample variances for a single finite population.
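The three average variances are simple finite sums and can be evaluated directly. The following sketch is an illustration under stated assumptions: the function names are invented here, `rho` is any function returning $\rho_u$, and the exponential correlogram with constant 0.2 is an arbitrary example. It codes (7), (11) and (15), and uses the unreduced form (14) as a cross-check on (15). It assumes $n \ge 2$ and $k \ge 2$.

```python
import math

def var_random(rho, n, k, sigma2=1.0):
    """Formula (7)."""
    N = k * n
    s = sum((N - u) * rho(u) for u in range(1, N))
    return sigma2 / n * (1 - 1 / k) * (1 - 2 * s / (N * (N - 1)))

def var_stratified(rho, n, k, sigma2=1.0):
    """Formula (11)."""
    s = sum((k - u) * rho(u) for u in range(1, k))
    return sigma2 / n * (1 - 1 / k) * (1 - 2 * s / (k * (k - 1)))

def var_systematic(rho, n, k, sigma2=1.0):
    """Formula (15)."""
    N = k * n
    s1 = sum((N - u) * rho(u) for u in range(1, N))
    s2 = sum((n - u) * rho(k * u) for u in range(1, n))
    return sigma2 / n * (1 - 1 / k) * (
        1 - 2 * s1 / (N * (k - 1)) + 2 * k * s2 / (n * (k - 1)))

def var_systematic_14(rho, n, k, sigma2=1.0):
    """Formula (14), the unreduced form, used only as a cross-check."""
    N = k * n
    s1 = sum((N - u) * rho(u) for u in range(1, N))
    s2 = sum((n - u) * rho(k * u) for u in range(1, n))
    total = (N - 1) / N * sigma2 * (1 - 2 * s1 / (N * (N - 1)))
    within = (n - 1) / n * sigma2 * (1 - 2 * s2 / (n * (n - 1)))
    return total - within

def rho(u):
    """Illustrative correlogram (an assumption, not a value from the paper)."""
    return math.exp(-0.2 * u)

n, k = 6, 5
print(var_random(rho, n, k), var_stratified(rho, n, k), var_systematic(rho, n, k))
```

When all the $\rho_u$ are zero the three functions return the same value, $(\sigma^2/n)(1 - 1/k)$, the familiar variance for uncorrelated elements.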


8. Relative accuracies of random and stratified random samples. First, some general comments. From (7), (11) and (15) the relative efficiencies of the three types of sample are seen to depend only on the linear functions of the $\rho$'s which appear in $\sigma_r^2$, $\sigma_{st}^2$, and $\sigma_{sy}^2$. It is easy to verify that in each case the sum of the coefficients of the $\rho$'s is unity. For the random sample, the linear function involves every serial correlation up to lag (kn - 1) with coefficients which decrease linearly as the lag increases and are independent of the size of sample, depending only on N = (kn), the number of elements in the finite population. For the stratified random sample, only serial correlations with lags up to (k - 1) appear, k being the number of elements in the stratum. As presented in (15), the formula for the systematic sample is separated into two linear functions. The first is the same function as appears in the formula for the random sample except that all coefficients are (kn - 1)/(k - 1) times as large. The second, which carries a positive sign, involves correlations where the lag is a multiple of k.

Thus far the formulae require no restrictions on the $\rho$'s. In considering the case where the $\rho$'s are positive and monotone decreasing, the following lemma is helpful.

Lemma. If $\rho_i$, $(i = 1, \ldots, m)$, are positive and monotone decreasing, that is, $\rho_i \ge \rho_{i+1} \ge 0$, and if $(a_1 + a_2 + \cdots + a_m)$ is zero, the necessary and sufficient conditions that

(16)    $L = a_1\rho_1 + a_2\rho_2 + \cdots + a_m\rho_m \ge 0$, for all admissible sets of $\rho$'s,

are

(17)    $a_1 + a_2 + \cdots + a_i \ge 0, \quad i = 1, 2, \ldots, (m - 1).$

For let $\rho_i = \rho_{i+1} + \delta_i$, where by hypothesis $\delta_i \ge 0$. Then if we substitute successively for $\rho_1, \rho_2, \ldots, \rho_{m-1}$ in terms of $\delta_1, \delta_2, \ldots, \delta_{m-1}$, we find

(18)    $L = a_1\delta_1 + (a_1 + a_2)\delta_2 + (a_1 + a_2 + a_3)\delta_3 + \cdots + (a_1 + a_2 + \cdots + a_{m-1})\delta_{m-1},$

the final term in $\rho_m$ vanishing because $(a_1 + \cdots + a_m)$ is zero. Since all $\delta_i \ge 0$, the sufficiency of (17) is obvious. Also, if for any i the coefficient of $\delta_i$ is negative, we can make L negative by choosing that $\delta_i$ as positive and all other $\delta$'s as zero. This establishes necessity.

Corollary. If $\rho_i$ are strongly monotone, i.e., $\rho_i > \rho_{i+1}$, and if at least one of the $a_i$ is different from zero, conditions (17) are sufficient to establish that L exceeds zero. For in (18) all the $\delta$'s are greater than zero and by (17) none of the $\delta$'s has a negative coefficient. Further, the coefficient of at least one of the $\delta$'s must exceed zero, otherwise all the a's would be zero. Hence $L > 0$.
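The substitution that leads from (16) to (18) is a summation by parts, and it can be confirmed with exact arithmetic. In the sketch below the coefficients and the steps $\delta_i$ are arbitrary illustrative values generated from a fixed seed; only the identity itself comes from the text.

```python
from fractions import Fraction
import random

rng = random.Random(1)
m = 6

# Coefficients a_1..a_m constrained to sum to zero (hypothetical values).
a = [Fraction(rng.randint(-5, 5)) for _ in range(m - 1)]
a.append(-sum(a))

# Non-negative steps delta_i give a positive, monotone decreasing set of rho's:
# rho_i = rho_m + delta_i + ... + delta_{m-1}.
delta = [Fraction(rng.randint(0, 3), 20) for _ in range(m - 1)]
rho_m = Fraction(1, 10)
rho = [rho_m + sum(delta[i:]) for i in range(m - 1)] + [rho_m]

# L as defined in (16).
L_direct = sum(ai * ri for ai, ri in zip(a, rho))

# L as given by (18): partial sums of the a's weighted by the deltas.
partial, running = [], Fraction(0)
for ai in a:
    running += ai
    partial.append(running)
L_from_18 = sum(partial[i] * delta[i] for i in range(m - 1))

print(L_direct == L_from_18)
```

Because L is a non-negative combination of the partial sums, its sign is controlled entirely by conditions (17), which is the content of the lemma.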

We now show that if the $\rho_u$ are monotone decreasing,

(19)    $L(k) = \frac{2}{k(k - 1)} \sum_{u=1}^{k-1} (k - u)\rho_u$

is a monotone decreasing function of k. This is the linear function which appears in the variance of the stratified sample.

(20)    $L(k) - L(k + 1) = \frac{2}{k(k - 1)} \sum_{u=1}^{k-1} (k - u)\rho_u - \frac{2}{(k + 1)k} \sum_{u=1}^{k} (k + 1 - u)\rho_u$

(21)    $\qquad = \frac{2}{k(k^2 - 1)} \sum_{u=1}^{k} (k + 1 - 2u)\rho_u.$

Since the sums of the coefficients of the $\rho_u$ are unity in L(k) and L(k + 1), the sum is zero in (21). Hence the lemma may be applied. But it is obvious that the sum of the first i coefficients in (21) exceeds zero, since the coefficients are all positive for $u < (k + 1)/2$ and all negative for $u > (k + 1)/2$. Hence

(22)    $L(k) - L(k + 1) \ge 0.$
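The step from (20) to (21) and the sign of the partial sums can be checked exactly for any particular k. The sketch below does so for k = 7, an arbitrary choice; the helper names are introduced here for illustration.

```python
from fractions import Fraction

def L_coeffs(k):
    """Coefficients of rho_1, ..., rho_{k-1} in L(k) as defined in (19)."""
    return [Fraction(2 * (k - u), k * (k - 1)) for u in range(1, k)]

def diff_coeffs(k):
    """Coefficients of rho_1, ..., rho_k in L(k) - L(k+1), term by term as in (20)."""
    first = L_coeffs(k) + [Fraction(0)]      # L(k) has no term in rho_k
    second = L_coeffs(k + 1)
    return [x - y for x, y in zip(first, second)]

k = 7
d = diff_coeffs(k)

# Closed form of the same coefficients, from (21).
closed = [Fraction(2 * (k + 1 - 2 * u), k * (k * k - 1)) for u in range(1, k + 1)]

# Partial sums of the coefficients, which the lemma requires to be non-negative.
partial, running = [], Fraction(0)
for c in d:
    running += c
    partial.append(running)

print(d == closed, min(partial) >= 0, partial[-1] == 0)
```

The partial sum of the first i coefficients is proportional to i(k - i), which is positive for i < k and zero for i = k, so conditions (17) hold.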


Further, by the corollary, if the $\rho_u$ are strongly monotone, L(k) is strongly monotone. Since all $\rho_u$ are positive, this result is sufficient to prove that

(23)    $\frac{2}{k(k - 1)} \sum_{u=1}^{k-1} (k - u)\rho_u \ge \frac{2}{(nk)(nk - 1)} \sum_{u=1}^{nk-1} (nk - u)\rho_u.$

Consequently, for any size of sample the average variance of the stratified sample cannot exceed that of the random sample. Further, the relative efficiency of the stratified sample to the random sample is monotone increasing with decreasing size of stratum, i.e. with increasing size of sample. There is, of course, nothing unexpected in these results. Equation (22) also establishes the result mentioned in the third section, that with monotone decreasing $\rho$, the average variance within strata increases steadily as the size of stratum increases. For if n(k - 1) degrees of freedom are assigned to the sum of squares within strata, formula (4) above shows that the average variance within strata is

(24)    $\sigma^2 \left\{ 1 - \frac{2}{k(k - 1)} \sum_{u=1}^{k-1} (k - u)\rho_u \right\}.$

9. Comparison of the systematic and random samples. Upon investigation, it is soon evident that no general results can be established about the efficiency of the systematic sample relative to the random samples, unless further restrictions are made on the form of the population. In order to apply the lemma, we find the sums of the first i coefficients of the linear functions of $\rho$ which appear in the variance formulae (7), (11) and (15). By elementary methods these sums are found to be

(25)    $S_{r,i} = \frac{i(2nk - i - 1)}{nk(nk - 1)},$

        $S_{st,i} = \frac{i(2k - i - 1)}{k(k - 1)}, \quad 1 \le i \le (k - 1); \qquad S_{st,i} = 1, \quad i \ge k,$

        $S_{sy,i} = \frac{i(2nk - i - 1)}{nk(k - 1)} - \frac{rk(2n - r - 1)}{n(k - 1)},$

where r is the integer such that $(r + 1)k > i \ge rk$.

From the lemma, in order to establish $\sigma_{sy}^2 \le \sigma_{st}^2$, it would be necessary to show that $S_{sy,i} \ge S_{st,i}$ for any i. Now if i is less than k, so that r is zero, clearly

(26)    $S_{sy,i} > S_{st,i} > S_{r,i}, \quad i = 1, 2, \ldots, (k - 1),$

except when n is 1, in which case all three are equal.

But if i is an integral multiple of k, say rk, we find

(27)    $S_{sy,i} = \frac{r}{n}, \qquad S_{r,i} = \frac{r}{n} \cdot \frac{(2nk - rk - 1)}{(nk - 1)},$

so that

(28)    $S_{st,i} > S_{r,i} > S_{sy,i}.$

Consequently the conditions of the lemma are not satisfied with regard to the systematic sample and no general theorem exists for all populations with monotone decreasing $\rho$. The result (26) and the corollary show that for any population in this class which has $\rho_u = 0$, $u > (k - 1)$, the systematic sample is more efficient than the stratified random sample. On the other hand, (28) shows that in a population with the first k of the $\rho$'s equal and the rest zero, the systematic sample has a higher variance than a random sample. If these two results are collated for a population with the first j of the $\rho$'s equal and the rest zero, we see that the systematic sample with stratum size j is less accurate than the comparable random sample, while the systematic sample with stratum size (j + 1) is more accurate than the comparable stratified random sample. Although such a population may not occur in practice, the result suggests that the graph of the variance of the mean against the size of sample is unlikely to exhibit the same regularity for the systematic as for the random samples.
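The anomaly is easy to reproduce numerically. In the sketch below the values j = 4, a common correlation of 0.2 and n = 5 are illustrative assumptions, as are the function names. For a given k the three variances share the factor $(\sigma^2/n)(1 - 1/k)$, so it is enough to compare the curly brackets of (7), (11) and (15).

```python
def bracket_random(rho, n, k):
    """Curly bracket of (7)."""
    N = n * k
    return 1 - 2 * sum((N - u) * rho(u) for u in range(1, N)) / (N * (N - 1))

def bracket_stratified(rho, n, k):
    """Curly bracket of (11)."""
    return 1 - 2 * sum((k - u) * rho(u) for u in range(1, k)) / (k * (k - 1))

def bracket_systematic(rho, n, k):
    """Curly bracket of (15)."""
    N = n * k
    s1 = sum((N - u) * rho(u) for u in range(1, N))
    s2 = sum((n - u) * rho(k * u) for u in range(1, n))
    return 1 - 2 * s1 / (N * (k - 1)) + 2 * k * s2 / (n * (k - 1))

j, common, n = 4, 0.2, 5        # illustrative values, not from the paper

def rho(u):
    """First j correlations equal, the rest zero."""
    return common if u <= j else 0.0

# Stratum size j: systematic compared with random.
sy_j, r_j = bracket_systematic(rho, n, j), bracket_random(rho, n, j)
# Stratum size j + 1: systematic compared with stratified.
sy_j1, st_j1 = bracket_systematic(rho, n, j + 1), bracket_stratified(rho, n, j + 1)

print(sy_j > r_j, sy_j1 < st_j1)
```

With these values the systematic sample is the worse of the two at stratum size 4 and the better of the two at stratum size 5, as the text describes.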

10. Populations in which the correlogram is concave upwards. Further investigation shows that the deciding factors in determining the relative accuracies of the systematic and random samples are the second differences of the $\rho_u$ rather than the first differences. The following result will be proved.

Theorem: For all infinite populations in which

$\rho_i \ge \rho_{i+1} \ge 0, \quad i = 1, 2, \ldots, (kn - 1),$

and

$\delta_i^2 = \rho_{i-1} + \rho_{i+1} - 2\rho_i \ge 0, \quad i = 2, 3, \ldots, (kn - 2),$

then

$\sigma_{sy}^2 \le \sigma_{st}^2 \le \sigma_r^2$

for any size of sample. Further, $\sigma_{sy}^2 < \sigma_{st}^2$, unless $\delta_i^2 = 0$, $i = 2, 3, \ldots, (kn - 2)$.

This result can be proved by expressing the linear functions of the $\rho_u$ in terms of second differences and establishing a new lemma applicable to second differences. An alternative approach is simpler and perhaps more instructive.

Since the $\rho_u$ are monotone decreasing, $\sigma_{st}^2 \le \sigma_r^2$ by the results in section 8. In (13) above, the variance of the mean of a systematic sample for a specified finite population was expressed as

(29)    $\frac{1}{kn} \sum_{i=1}^{kn} (x_i - \bar{x})^2 - \frac{1}{kn} (\text{Total S.S. within systematic samples})$

        $= \frac{1}{kn} \sum_{i=1}^{kn} (x_i - \bar{x})^2 - \frac{1}{n} (\text{Average S.S. within a systematic sample}).$

A corresponding equation holds for stratified random samples. For if $x_{1j}, x_{2j}, \ldots, x_{nj}$ are the elements of any stratified random sample with mean $\bar{x}_{st}$,

(30)    $\sum_{i=1}^{n} (x_{ij} - \bar{x})^2 = \sum_{i=1}^{n} (x_{ij} - \bar{x}_{st})^2 + n(\bar{x}_{st} - \bar{x})^2.$

Now take the average over all $k^n$ samples. This gives

(31)    $\frac{1}{k} \sum_{i=1}^{kn} (x_i - \bar{x})^2 = (\text{Average S.S. within samples}) + nE(\bar{x}_{st} - \bar{x})^2.$

Since the term on the extreme right is n times the variance of the stratified random sample, a result analogous to (29) follows at once.

Consequently, $\sigma_{sy}^2 \le \sigma_{st}^2$ if the average sum of squares within a systematic sample is greater than or equal to that within a stratified random sample. Now by (2), with n in place of (kn), each of these averages is equal to

(32)    $\frac{(n - 1)}{2} E(x_{tj} - x_{lj})^2,$

where $x_{tj}$, $x_{lj}$ are the elements in the sample from the tth and the lth strata respectively, the average being taken over all possible pairs of strata.

We consider a fixed pair of strata and let $l - t = u$. For the systematic sample, corresponding elements in the tth and lth strata are always (ku) elements apart. Hence,

(33)    $E_{sy}(x_{tj} - x_{lj})^2 = 2\sigma^2 (1 - \rho_{ku}).$

For the stratified random sample, there are $k^2$ possible pairs of elements from the two strata. One pair is (ku - k + 1) elements apart, two pairs are (ku - k + 2) elements apart, and so on, the numbers of pairs rising linearly to k and then decreasing linearly to one for the final pair which are (ku + k - 1) elements apart. This gives

(34)    $E_{st}(x_{tj} - x_{lj})^2 = 2\sigma^2 \left\{ 1 - \frac{1}{k^2} \sum_{i=-(k-1)}^{(k-1)} (k - |i|)\rho_{ku+i} \right\}.$

Hence, to complete the proof that $\sigma_{sy}^2 \le \sigma_{st}^2$, it is sufficient to show that

(35)    $\sum_{i=-(k-1)}^{(k-1)} (k - |i|)\rho_{ku+i} - k^2 \rho_{ku} \ge 0$

for $u = 1, 2, \ldots, (n - 1)$, that is, for any pair of strata. This may be written

(36)    $\sum_{i=1}^{(k-1)} (k - i)(\rho_{ku+i} + \rho_{ku-i} - 2\rho_{ku}) \ge 0.$

But if $\delta_{ku}^2 = \rho_{ku-1} + \rho_{ku+1} - 2\rho_{ku}$ is the second central difference it is easy to show that

(37)    $\rho_{ku+i} + \rho_{ku-i} - 2\rho_{ku} = \sum_{j=-(i-1)}^{(i-1)} (i - |j|)\, \delta_{ku+j}^2 \ge 0,$

since by hypothesis $\delta_j^2 \ge 0$, $j = 2, 3, \ldots, (kn - 2)$. This proves that the variance between the elements of the systematic sample is greater than or equal to that between the elements of the stratified random sample for any fixed pair of strata. The result for the overall average follows. Hence $\sigma_{sy}^2 \le \sigma_{st}^2$. Further, unless $\delta_j^2 = 0$, for all j, clearly $\sigma_{sy}^2 < \sigma_{st}^2$, except for samples of one.
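Identity (37) and inequality (35) can both be checked numerically for a particular concave-upwards correlogram. The exponential form and the constant 0.3 below are illustrative assumptions, as are the function names and the values of k and n.

```python
import math

def second_diff(rho, u):
    """Second central difference rho_{u-1} + rho_{u+1} - 2 rho_u."""
    return rho(u - 1) + rho(u + 1) - 2 * rho(u)

def lhs_37(rho, k, u, i):
    return rho(k * u + i) + rho(k * u - i) - 2 * rho(k * u)

def rhs_37(rho, k, u, i):
    return sum((i - abs(j)) * second_diff(rho, k * u + j)
               for j in range(-(i - 1), i))

def lhs_35(rho, k, u):
    weighted = sum((k - abs(i)) * rho(k * u + i) for i in range(-(k - 1), k))
    return weighted - k * k * rho(k * u)

def rho(u):
    """Illustrative concave-upwards correlogram (an assumption)."""
    return math.exp(-0.3 * u)

k, n = 4, 5
# Largest discrepancy between the two sides of (37) over all u and i.
gap_37 = max(abs(lhs_37(rho, k, u, i) - rhs_37(rho, k, u, i))
             for u in range(1, n) for i in range(1, k))
# Smallest value of the left side of (35) over all pairs of strata.
min_35 = min(lhs_35(rho, k, u) for u in range(1, n))
print(gap_37, min_35)
```

The two sides of (37) agree to rounding error, and the left side of (35) is positive for every pair of strata, as the proof requires.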

The essential point in the proof may be put as follows. The elements in the tth and lth strata are on the average (ku) elements apart for both the systematic and the stratified random sample. When two elements in the latter sample are (ku + i) elements apart, they are less correlated than on the average, since $\rho_{ku+i} < \rho_{ku}$, and thus provide more independent information. The variance between the elements exceeds the systematic sample variance by $2\sigma^2(\rho_{ku} - \rho_{ku+i})$. However, such cases are counterbalanced by an equal number of cases in which the elements differ by (ku - i) and the variance is below the systematic sample variance by $2\sigma^2(\rho_{ku-i} - \rho_{ku})$. Because of the concavity of $\rho_u$, the losses on the average balance or outweigh the gains.

For the population discussed in section 9, in which $\rho_u = \rho$, $u = 1, 2, \ldots, j$; $\rho_u = 0$, $u > j$, we have $\delta_j^2 < 0$, $\delta_{j+1}^2 > 0$, and $\delta_u^2 = 0$ otherwise. This reversal of the sign of the second difference is the explanation for the anomalous behavior of the systematic samples with stratum sizes j and (j + 1).

The theorem above does not prove that the relative accuracy of the systematic to the stratified random sample is a monotone function of n, nor even that $\sigma_{sy}^2$ decreases steadily as n increases. Actually, there are populations in the class for which neither result holds, as will be illustrated in the next section.

So far as practical applications are concerned, the restriction that the $\rho_u$ should be concave upwards may not be severe. For instance, this condition is satisfied when the correlogram is linear, i.e. $\rho_u = (L - u)/L$, this being one type of correlogram which Wold [6] has considered applicable to economic data. Concavity also holds for the function $\rho_u = e^{-\lambda u}$ which Osborne [7] has suggested for forestry and land-use surveys and for the relation $\rho_u = \tanh(u^{-3/5})$ which Fisher and Mackenzie [8] used for expressing the correlation between the weekly rain at two weather stations as a function of their distance apart. In fact, if $\rho_u$ is conceived of as positive and continuous for all u, a concave upwards function suggests itself naturally.


11. Linear correlograms. It may be of interest to present some results obtained when the correlogram is (i) linear, (ii) exponential, since both types have been suggested as possible models for populations occurring in practice.

In the linear case,

(38)    $\rho_u = (L - u)/L, \quad u \le L; \qquad \rho_u = 0, \quad u > L.$

If $L \ge (nk - 1)$, the correlogram is a straight line throughout the whole range of the finite population. Since all second differences are zero in this case, we may expect $\sigma_{sy}^2 = \sigma_{st}^2 < \sigma_r^2$. If $L < (nk - 1)$, all second differences vanish except $\delta_L^2$, which is positive. Hence we may expect $\sigma_{sy}^2 < \sigma_{st}^2 < \sigma_r^2$.

The results for these cases are found by elementary summations from the basic formulae (7), (11) and (15). Details of the summations will not be presented. For $L \ge (nk - 1)$, we find

(39)    $\sigma_r^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \frac{(nk + 1)}{3L}, \qquad \sigma_{st}^2 = \sigma_{sy}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \frac{(k + 1)}{3L}.$

The ratio $\sigma_r^2 / \sigma_{st}^2$ is (nk + 1)/(k + 1), which is approximately equal to n, the size of sample, unless the percentage sampled is large. Thus very large gains in efficiency over random sampling are obtained.

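If $L < (nk - 1)$, the formulae are less simple. Consider first $k > L$; that is, cases where the percentage sampled is less than 100/L. If N = nk,

(40)    $\sigma_r^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ \frac{3N(N - L) + (L^2 - 1)}{3N(N - 1)} \right\},$

(41)    $\sigma_{st}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ \frac{3k(k - L) + (L^2 - 1)}{3k(k - 1)} \right\}, \quad k > L,$

(42)    $\sigma_{sy}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ \frac{3N(k - L) + (L^2 - 1)}{3N(k - 1)} \right\}, \quad k > L.$

It is clear on inspection that $\sigma_{sy}^2 < \sigma_{st}^2 < \sigma_r^2$; moreover, it is easy to show that the efficiency of systematic relative to stratified random sampling increases steadily as the size of sample increases.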

When the size of sample is increased further so that k < L, formula (40) remains unchanged, while $\sigma_{st}^2$ is now given by the same formula as in (39). The formula for $\sigma_{sy}^2$ is more complex. If q is the integral part of the quotient when L is divided by k and r is the remainder, so that L = (qk + r), the formula may be written

(42')    $\sigma_{sy}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ \frac{qk(k^2 - 1) + 3rk(n - q)(k - r) + r(r^2 - 1)}{3NL(k - 1)} \right\}, \quad k < L.$

It is noteworthy that the last two terms in the numerator inside the curly bracket vanish whenever L is exactly divisible by k. Further, the second term is of order nk = N and, when present, exerts a much greater weight than the first term. Thus $\sigma_{sy}^2$ takes a sudden dip whenever L is a multiple of k. In fact, for L = qk, (42') reduces to

(43)    $\sigma_{sy}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \frac{(k + 1)}{3N}, \quad L = qk,$

so that the variance goes to zero if N is sufficiently large. By comparison with formula (39) for $\sigma_{st}^2$ we see that when L = qk the relative efficiency of systematic to stratified random sampling is N/L, which increases beyond bound if N is sufficiently large. In intermediate cases, when the remainder r does not vanish, the leading term in the relative efficiency for N large is $(k^2 - 1)/[3r(k - r)]$. This varies somewhat irregularly, depending on the relation between L and k.

To illustrate, numerical values are given below when L = 10 and the finite population is large enough so that terms in 1/N are negligible.

The quantities $v_{st}$, $v_{sy}$ are the corresponding variances apart from a factor $\sigma^2/N$. The stratified sample variance decreases steadily with increasing percentage sampled. On the other hand the systematic sample variance goes to zero and the relative efficiency to infinity when k is 2, 5 or 10. Moreover, in the intermediate cases k = 3, 4, 6, 7, 8, 9, the variance and the relative efficiency show no consistent relation to the percentage sampled. For samples of less than 10 per cent, including the cases outside the limits of the table, the relative efficiency decreases steadily from 4 at k = 11 to 1 when k is large.


TABLE 1

Variances except for a factor $\sigma^2/N$ and relative efficiency for systematic and stratified random samples for a linear correlogram

k              2     3     4     5     6     7     8     9     10    11    20
% Sampled      50    33    25    20    17    14    12    11    10    9     5
v_st           .10   .27   .50   .80   1.17  1.60  2.10  2.67  3.30  4.00  11.65
v_sy           0     .20   .40   0     .80   1.20  1.20  .80   0     1.00  10.00
v_st/v_sy      ∞     1.33  1.25  ∞     1.46  1.33  1.75  3.33  ∞     4.00  1.16
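The entries of Table 1 can be recomputed from the general formulae (11) and (15) with the linear correlogram (38). The sketch below is illustrative: the function names are introduced here, and the limit of a large population is approximated by an assumed large number of strata, n = 10^6. Since $N\sigma_{st}^2/\sigma^2 = (k - 1)\{\ldots\}$, each function returns (k - 1) times the relevant curly bracket.

```python
def rho(u, L):
    """Linear correlogram (38)."""
    return (L - u) / L if u <= L else 0.0

def v_st(k, L):
    """N * sigma_st^2 / sigma^2 from (11); it does not depend on n."""
    s = sum((k - u) * rho(u, L) for u in range(1, k))
    return (k - 1) * (1 - 2 * s / (k * (k - 1)))

def v_sy(k, L, n):
    """N * sigma_sy^2 / sigma^2 from (15); only lags with rho > 0 are summed."""
    N = k * n
    s1 = sum((N - u) * rho(u, L) for u in range(1, min(N, L + 1)))
    s2 = sum((n - u) * rho(k * u, L) for u in range(1, min(n, L // k + 1)))
    return (k - 1) * (1 - 2 * s1 / (N * (k - 1)) + 2 * k * s2 / (n * (k - 1)))

L, n = 10, 10**6      # n is an assumed large value standing in for the limit
for k in (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 20):
    print(k, round(v_st(k, L), 2), round(v_sy(k, L, n), 2))
```

The printed values can be set beside the table; the systematic variance is negligible at k = 2, 5 and 10, the divisors of L.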


12. Exponential correlograms. For the exponential $\rho_u = e^{-\lambda u}$ the results are much more regular. Each of the linear functions of the $\rho$'s consists of a finite number of terms of an expansion of the form $(1 - x)^{-2}$. If

(44)    $f(N, \lambda) = \frac{2}{N(N - 1)} \cdot \frac{(N - 1)e^{\lambda} - N + e^{-(N-1)\lambda}}{(e^{\lambda} - 1)^2},$

which is the sum for $\sigma_r^2$, we find

(45)    $\sigma_r^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \{ 1 - f(N, \lambda) \},$

(46)    $\sigma_{st}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \{ 1 - f(k, \lambda) \},$

(47)    $\sigma_{sy}^2 = \frac{\sigma^2}{n} \left( 1 - \frac{1}{k} \right) \left\{ 1 - \frac{(N - 1)}{(k - 1)} f(N, \lambda) + \frac{k(n - 1)}{(k - 1)} f(n, k\lambda) \right\}.$

It may be shown that the variance of the systematic sample decreases steadily and its efficiency relative to stratified sampling increases steadily as the sample becomes larger.

In order to obtain some idea of the magnitude of the gain in efficiency, consider
the case where $k$ and $n$ are large. For this case the relative efficiency, which
actually is a function of $k$, $n$ and $\lambda$, turns out to depend almost entirely on the
single quantity $k\lambda$ or, equally, on the correlation $e^{-k\lambda}$ between the items in
successive strata in the systematic sample. If $t = k\lambda$, we obtain $\sigma_r^2 = \sigma^2/n$,

$$\sigma_{st}^2 = \frac{\sigma^2}{n}\left\{1 - \frac{2}{t} + \frac{2}{t^2} - \frac{2e^{-t}}{t^2}\right\}, \tag{48}$$

$$\sigma_{sy}^2 = \frac{\sigma^2}{n}\left\{1 - \frac{2}{t} + \frac{2}{e^{t} - 1}\right\}. \tag{49}$$

The relative efficiency is given in Table 2 for a selection of values of $\rho = e^{-t}$, the
correlation between the items in successive strata.

The relative efficiency has a limiting value 2 when $\rho$ tends to 1 and decreases
slowly towards 1 as $\rho$ falls to zero. The gains in efficiency are quite substantial
if $\rho$ exceeds 0.1.
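The behaviour just described can be reproduced numerically. The sketch below assumes the large-sample forms of (48) and (49), namely $1 - 2/t + 2/t^2 - 2e^{-t}/t^2$ for the stratified sample and $1 - 2/t + 2/(e^t - 1)$ for the systematic sample, each in units of $\sigma^2/n$, with $\rho = e^{-t}$. The function names are illustrative only.

```python
import math

def variance_factors(rho):
    """Return (stratified, systematic) variances in units of sigma^2 / n."""
    t = -math.log(rho)  # rho = exp(-t) is the correlation between successive strata
    stratified = 1 - 2 / t + 2 / t**2 - 2 * math.exp(-t) / t**2
    systematic = 1 - 2 / t + 2 / (math.exp(t) - 1)
    return stratified, systematic

def relative_efficiency(rho):
    """Ratio of the stratified to the systematic variance."""
    stratified, systematic = variance_factors(rho)
    return stratified / systematic

for rho in (0.01, 0.1, 0.3, 0.5, 0.9, 0.99):
    print(rho, round(relative_efficiency(rho), 2))
```

Evaluating the ratio over a grid of $\rho$ in this way shows the slow rise from 1 towards the limiting value 2.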


TABLE 2

Relative efficiency of systematic and stratified random samples for an exponential
correlogram

[Table entries are illegible in this copy apart from a correlation of .3 and a relative efficiency of 1.55.]


It was pointed out in section 1 that no unbiased estimate of error is available 
from a single sample for either the systematic or the stratified random sample. 
This does not mean that no estimate of error can be attempted. However, any 
estimate must depend on certain assumptions about the form of the population 
which is being sampled and is likely to be vitiated insofar as these assumptions 
are false. If, for instance, the correlogram were assumed to be exponential, 
formula (47), or (49) in the particular case with n, k large, would appear to be 
the appropriate basis for the estimation of error from a single systematic sample. 
Consider the simpler case in which (49) is valid. The correlation between 
successive items in the systematic sample provides an estimate of $e^{-t}$ and hence
of $t$. Also, if terms in $1/n$ are negligible, the mean square within the systematic
sample is found to be an unbiased estimate of $\sigma^2$. By substitution in (49) a
consistent estimate of the variance of a single systematic sample would be secured, 
provided that the exponential assumption were correct. The gains in efficiency 
over stratified and random sampling could also be estimated.

OPERATING CHARACTERISTICS FOR THE COMMON STATISTICAL
TESTS OF SIGNIFICANCE

By Charles D. Ferris, Frank E. Grubbs, Chalmers L. Weaver
Ballistic Research Laboratory, Aberdeen Proving Ground

1. Summary. Methods making possible quick calculation of operating
characteristics or power curves of common tests of significance involving the $\chi^2$,
$F$, $t$, and normal distributions are presented. In addition, a comprehensive set
of curves illustrating graphically the power of each test for the 5% significance
level is included. We are interested in the power of: (1) the $\chi^2$-test to
determine whether an unknown population standard deviation is greater or less than a
standard value, (2) the $F$-test to determine whether one unknown population
standard deviation is greater than another (one-sided alternative), and (3) the
$t$-test and normal test to determine whether an unknown population mean
differs from a standard or two unknown population means differ from each other.
Such operating characteristics have application for the quality control engineer
and statistician in the design of sampling inspection plans using variables where
they may be used to determine the sample size that will guarantee a specified
consumer's and producer's risk. On the other hand they are of use in displaying
the power of a test if the sample size has already been set. Finally, they are a
necessary adjunct to the proper interpretation of the common tests of significance.

2. Introduction. In the application of the common statistical tests of
significance there has been a great need for readily accessible information on the
power of the test employed to distinguish between the null hypothesis and
pertinent alternative hypotheses for given sample size. In this connection, two
important applications arise. On one hand it becomes important for the sampler
to know, for a given sample size and critical region, something about the power
of the test in rejecting the stated hypothesis when some alternative hypothesis is
true. On the other hand, if the sampler wants a given degree of assurance in
rejecting the null hypothesis when a particular alternative is true, he would like
to know the minimum sample size which would accomplish this when the
probability of rejecting the null hypothesis when true is given. In particular, the
need for such information arises most frequently in setting sample sizes to
distinguish effectively, on the basis of single sample results, between (1) population
standard deviations and (2) population means. If the sample size has already
been set, as is the case with most specifications, quick information on whether
or not it is large enough to keep the risk of accepting poor material down to a
reasonable figure is highly desirable. Such probabilities will be recognized, of
course, as the Type I and Type II errors of the Neyman-Pearson theory. Such
risks must be given proper consideration in the interpretation of a significance
test or in designing the provisions of an acceptance test.

Needless to say, the appropriate expressions for the power functions of the
$\chi^2$-test, $F$-test, normal test, and $t$-test have been derived at one time or another
in the literature. However, insofar as the practical statistician or quality
control engineer is concerned, such information has not been employed to advantage
widely since no informative graphs or extensive tables of power functions for the
common statistical tests of significance have been presented. Due to the
practical importance of questions of this type, the authors believe there is need for
operating characteristics or graphical power functions of the common statistical
tests of significance. This paper supplies such a need over a useful range of
sample sizes and alternative hypotheses for the 5% significance level.


3. Definitions. In the following account, we will refer to one or both of the
normal populations, $\pi_1$ and $\pi_2$. We will let $x_1$ be a variate from $\pi_1$ whose expected
value or mean is $\mu_1$ and standard deviation $\sigma_1$. By $n_1$ we will mean the number
of observations drawn at random from $\pi_1$ and our sample statistics will be
defined in the usual fashion:

$$\bar{x}_1 = \sum_{i=1}^{n_1} x_{1i}/n_1, \qquad s_1^2 = \sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2/(n_1 - 1).$$

Similar definitions apply to the normal population $\pi_2$ with the appropriate
subscript for sample statistics and population values. In dealing with a single
population we will drop the subscripts from the sample statistics.

We also define

$\sigma$ = a standard or arbitrary value of the standard deviation,
$a$ = a standard or given level,

$$s_{12}^2 = \frac{\sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2} (x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$$

when two normal populations are encountered.

$H_0$ will be used to denote the null hypothesis and $H_1$ any one of a set of
alternative hypotheses. The probability of rejecting the null hypothesis $H_0$ when
it is true (Type I error) will be denoted by $\alpha$, and the probability of accepting the
null hypothesis when some alternative hypothesis $H_1$ is true (Type II error)
will be denoted by $\beta$.


4. Power function of the $\chi^2$-test. The statistic $\chi^2 = (n - 1)s^2/\sigma^2$ (dropping
subscripts of sample statistics) is used to accept or reject the hypothesis that the
standard deviation, $\sigma_1$, of the normal population sampled is some specified or
given value, $\sigma$.

Our hypotheses are

$H_0$: $\sigma_1 = \sigma$

$H_1$: $\sigma_1 = \lambda\sigma$, $(\lambda > 0)$.

A. To determine whether or not $\sigma_1 > \sigma$. We choose a significance level, $\alpha$,
and compute $\chi^2 = (n - 1)s^2/\sigma^2$. If $\chi^2 > \chi^2_\alpha$, where the percentage point $\chi^2_\alpha$ is
determined by

$$\frac{1}{2^{(n-1)/2}\,\Gamma[(n-1)/2]} \int_{\chi^2_\alpha}^{\infty} u^{(n-3)/2} e^{-u/2}\, du = \alpha, \tag{1}$$

we reject $H_0$ and conclude that $\sigma_1 > \sigma$.

To set up the power function we note that:
If $H_0$ is true

$$\Pr\left\{\frac{(n-1)s^2}{\sigma^2} > \chi^2_\alpha\right\} = \alpha.$$

If $H_1$ is true

$$\Pr\left\{\frac{(n-1)s^2}{\sigma^2} > \chi^2_\alpha\right\} = 1 - \beta, \qquad (1 - \beta = \alpha \text{ if } \lambda = 1).$$

However, since

$$\Pr\left\{\frac{(n-1)s^2}{\lambda^2\sigma^2} > \chi^2_{1-\beta}\right\} = 1 - \beta$$

or

$$\Pr\left\{\frac{(n-1)s^2}{\sigma^2} > \lambda^2\chi^2_{1-\beta}\right\} = 1 - \beta,$$

we have the relation

$$\lambda^2\chi^2_{1-\beta} = \chi^2_\alpha \quad \text{or} \quad \lambda = \sqrt{\chi^2_\alpha/\chi^2_{1-\beta}}.$$

Therefore, for a given significance level, $\alpha$ (Type I error), and various Type II
errors, $\beta$, we can make use of the Tables of Percentage Points of the $\chi^2$-distribution
[1] and compute enough of the points $(\lambda, \beta)$ to plot the power curves
depicted in Fig. 1. The Type I error, $\alpha$, has been set at the practical level of .05
for Fig. 1.

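B. To detect $\sigma_1 < \sigma$. We compute

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

and if $\chi^2 < \chi^2_{1-\alpha}$ we reject $H_0$, concluding that $\sigma_1 < \sigma$.

By reasoning similar to that in A, we arrive at the relationship

$$\lambda^2\chi^2_{\beta} = \chi^2_{1-\alpha} \quad \text{or} \quad \lambda = \sqrt{\chi^2_{1-\alpha}/\chi^2_{\beta}}.$$

Again, by use of the Table of Percentage Points of the $\chi^2$-Distribution the
operating characteristics of Fig. 2 are obtained. We have chosen the practical level of
$\alpha$ = .05 for Fig. 2.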




Fig. 1. Operating Characteristics of the $\chi^2$-Test [$\chi^2 = (n - 1)s^2/\sigma^2$] for Testing $\sigma_1 = \sigma$ against $\sigma_1 > \sigma$


Example:¹ A Rifle Association is purchasing small arms ammunition for
match purposes. It is the desire of the rifle club that the dispersion in muzzle
velocity of a lot of ammunition intended for match purposes be kept down to a
practical minimum. Acceptance or rejection of an ammunition lot must, of
course, be made on a sampling basis since the ballistic acceptance test is
destructive in nature. Moreover, for practical reasons acceptance of a given lot
is to be on the basis of a single sample. The Association specifies that they are
not willing to accept more than 5% of the lots whose standard deviation in
muzzle velocity is 6 ft./sec. The ammunition manufacturer agrees that he will
accept these terms provided not more than 5% of the lots whose standard
deviation in muzzle velocity is 4 ft./sec. will be rejected. Under these agreements,
it is desired to know what sample size is necessary to provide the stated
assurances for the Rifle Association and the ammunition manufacturer.

In this problem, $\alpha$ = .05, $\beta$ = .05, and $\lambda$ = 1.5. Referring to Fig. 1, we
find the required sample size is approximately 35.

On the other hand, if a sample size had already been set, the appropriate
curve in Fig. 1 could be examined to determine whether it provided sufficient
protection against the acceptance of inferior ammunition.
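The relation $\lambda = \sqrt{\chi^2_\alpha/\chi^2_{1-\beta}}$ can also be evaluated without tables. The following is a minimal sketch rather than the procedure used for the charts: it integrates the $\chi^2$ density by Simpson's rule and finds percentage points by bisection, using only the Python standard library. All function names are illustrative, and the integration assumes more than 2 degrees of freedom so that the density is bounded at zero.

```python
import math

def chi2_pdf(u, df):
    """Chi-square density; this sketch assumes df > 2."""
    if u <= 0.0:
        return 0.0
    log_pdf = ((df / 2 - 1) * math.log(u) - u / 2
               - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_pdf)

def chi2_cdf(x, df, steps=2000):
    """Pr{chi-square <= x} by composite Simpson integration (steps is even)."""
    if x <= 0.0:
        return 0.0
    h = x / steps
    total = chi2_pdf(0.0, df) + chi2_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return total * h / 3

def chi2_upper_point(p, df):
    """Value c with Pr{chi-square > c} = p, found by bisection."""
    lo, hi = 0.0, df + 20 * math.sqrt(2 * df) + 20
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - chi2_cdf(mid, df) > p:
            lo = mid   # upper tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

def detectable_ratio(n, alpha, beta):
    """lambda = sqrt(chi2_alpha / chi2_{1-beta}) with n - 1 degrees of freedom."""
    df = n - 1
    return math.sqrt(chi2_upper_point(alpha, df) / chi2_upper_point(1 - beta, df))

print(round(detectable_ratio(35, 0.05, 0.05), 3))
```

For the ammunition example ($\alpha = \beta = .05$), evaluating `detectable_ratio` for sample sizes near 35 shows where the detectable ratio reaches the required 6/4 = 1.5.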


¹ This example is used to illustrate the use of the power of the $\chi^2$-test and is not advocated as a most powerful sampling technique (see ref. [10]).


5. Power function of the F-test. In discussing the power function of the
$F$-test we will focus our attention on the problem of comparing the standard
deviations of two normal populations.

A. To determine whether or not the standard deviation, $\sigma_1$, of one normal
population is greater than the standard deviation, $\sigma_2$, of another normal
population. We choose a significance level, $\alpha$, and compute $F = s_1^2/s_2^2$. If $F > F_\alpha$,
where the percentage point $F_\alpha$ is determined by

$$\frac{\Gamma[\tfrac{1}{2}(n_1 + n_2 - 2)]}{\Gamma[\tfrac{1}{2}(n_1 - 1)]\,\Gamma[\tfrac{1}{2}(n_2 - 1)]}\,(n_1 - 1)^{(n_1-1)/2}(n_2 - 1)^{(n_2-1)/2} \int_{F_\alpha}^{\infty} \frac{u^{(n_1-3)/2}}{[(n_1 - 1)u + (n_2 - 1)]^{(n_1+n_2-2)/2}}\, du = \alpha, \tag{2}$$

we conclude that $\sigma_1 > \sigma_2$.

Our hypotheses are

$H_0$: $\sigma_1 = \sigma_2$

$H_1$: $\sigma_1 = \lambda\sigma_2$, $(\lambda > 1)$.

To set up the power function of the $F$-test we note that:
If $H_0$ is true

$$\Pr\{s_1^2/s_2^2 > F_\alpha\} = \alpha.$$

If $H_1$ is true

$$\Pr\{s_1^2/s_2^2 > F_\alpha\} = 1 - \beta, \qquad (1 - \beta = \alpha \text{ if } \lambda = 1).$$

However, since

$$\Pr\{s_1^2/(\lambda^2 s_2^2) > F_{1-\beta}\} = 1 - \beta$$

or

$$\Pr\{s_1^2/s_2^2 > \lambda^2 F_{1-\beta}\} = 1 - \beta,$$

we have the relation $\lambda^2 F_{1-\beta} = F_\alpha$ or $\lambda = \sqrt{F_\alpha/F_{1-\beta}}$.

Therefore, for a given Type I error, $\alpha$, and various Type II errors, $\beta$, we can
make use of the Table of Percentage Points of the $F$-Distribution [2] and
compute sufficient points $(\lambda, \beta)$ to plot the operating characteristics depicted in
Figs. 3, 4, and 5. In these figures, $\alpha$ has been set at the practical level of .05.
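As a hedged illustration of the relation $\lambda^2 F_{1-\beta} = F_\alpha$, the Type II error can be computed directly as $\beta = \Pr\{F \le F_\alpha/\lambda^2\}$ for central $F$ with $n_1 - 1$ and $n_2 - 1$ degrees of freedom. The sketch below is not the tabular method described above: it evaluates the incomplete beta function by Simpson's rule and therefore assumes both samples contain more than 3 observations. The function names are ours.

```python
import math

def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta I_x(a, b) by Simpson's rule; assumes a, b > 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3

def f_cdf(f, df1, df2):
    """Pr{F <= f} for the central F-distribution."""
    if f <= 0.0:
        return 0.0
    return beta_cdf(df1 * f / (df1 * f + df2), df1 / 2, df2 / 2)

def f_upper_point(p, df1, df2):
    """Value c with Pr{F > c} = p, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if 1 - f_cdf(mid, df1, df2) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def type_ii_error(n1, n2, lam, alpha=0.05):
    """beta = Pr{F <= F_alpha / lam^2} when sigma_1 = lam * sigma_2."""
    df1, df2 = n1 - 1, n2 - 1
    return f_cdf(f_upper_point(alpha, df1, df2) / lam**2, df1, df2)

for n in (20, 54):
    print(n, round(type_ii_error(n, n, 1.5), 3))
```

With equal sample sizes and $\lambda = 1.5$, the loop prints $\beta$ for samples of 20 and 54, which may be compared with the values read from Fig. 3.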

It should be emphasized that the operating characteristics presented in this
paper are applicable only when one is interested in the one-sided alternative that
$\sigma_1 > \sigma_2$ and not $\sigma_1 < \sigma_2$. Under these circumstances, the exact formation of the
$F$ ratio will be set beforehand and will not depend upon test results (for example,
placing the greatest mean square in the numerator). In those cases where one
is interested in the two-sided alternative, a two-tail $F$-test such as described by
H. Scheffé [3] should be used. It is hoped that at a later date operating
characteristics of such a test calculated in a manner similar to the example in [3]
will be presented.

Example: It became necessary for a manufacturer to make a choice between
a new type casting and one produced under standard design practices. One of
the bases of comparison was dispersion in tensile strength. It was considered
that if the standard deviation of the standard casting were larger than the new
type, definite preference should be given to the latter. When the question of a
practical criterion for rejecting the standard casting was considered, it was
decided that if its true standard deviation in tensile strength were actually 1½
times that of the new type there should be a 90% chance of rejection. It would
be of little practical importance to detect any ratio less than 1½ in this particular
case. It was also decided that the 5% significance level would suffice insofar
as rejection of equal quality was concerned. A preliminary sample size of 20
was selected, and the question arose as to how well a sample of this size gave the
protection desired.

The question can be answered immediately by reference to Fig. 3 (here $s_1^2$
is computed from the standard casting data, of course) where it is seen that a
sample size of 20 will fail to detect the stated difference 47% of the time. In
order to achieve the desired protection, it is seen at once from Fig. 3 that a
sample size of over 50 will be necessary. The exact sample size, determined
with the aid of the formulas above, is found to be 54.

Fig. 5. Operating Characteristics of the F-Test [$F = s_1^2/s_2^2$] for Testing $\sigma_1 = \sigma_2$ against $\sigma_1 > \sigma_2$

B. Analysis of variance. We shall consider the analysis of variance layout
where a sample of $n$ items is drawn from each of $m$ normal populations with
common variance $\sigma^2$. It is required to decide on the basis of the sample results
whether or not there is any variation among the true means of the $m$ normal
populations sampled.

Let $x_{ij}$ be the $j$th item drawn at random from the $i$th population,

$$\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}, \quad \text{and} \quad \bar{x} = \frac{1}{m}\sum_{i=1}^{m} \bar{x}_i.$$

The $F$-test utilizes the comparison of the variation among the sample means
(external variance) with that among the items within the samples (internal
variance) in order to test the equality of population means by making use of the
ratio

$$F = \frac{n \sum_{i} (\bar{x}_i - \bar{x})^2 \; m(n - 1)}{\sum_{i,j} (x_{ij} - \bar{x}_i)^2 \; (m - 1)}.$$

If $F > F_\alpha$, where $F_\alpha$ is defined as in 5.A, we conclude that the population
means are not equal.

In our approach we will assume that the $m$ true lot means represent a sample
from a super-population, also normal, with variance equal to $\theta^2\sigma^2$. Since the
sampling variance of the means is $\sigma^2/n$, the total variance among the sample
means equals

$$\sigma^2/n + \theta^2\sigma^2 = \lambda^2\sigma^2/n, \qquad (\lambda^2 = 1 + n\theta^2).$$

Hence, our hypotheses are

$H_0$: $\theta = 0$

$H_1$: $\theta > 0$.

Since $F/\lambda^2$ follows the $F$-distribution with $m - 1$ and $m(n - 1)$ degrees of
freedom the operating characteristic, i.e. the probability for various $\theta$ of
accepting $H_0$, may be obtained from the curves already graphed by setting $n_1 = m$,
$n_2 = nm - m + 1$, and $\lambda^2 = 1 + n\theta^2$.
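The substitution just described is simple arithmetic, and a short sketch may make it concrete. The function name and the sample values are illustrative only.

```python
import math

def anova_chart_parameters(m, n, theta):
    """Map an analysis of variance layout onto the F-test chart parameters."""
    n1 = m               # so that n1 - 1 = m - 1 degrees of freedom
    n2 = n * m - m + 1   # so that n2 - 1 = m(n - 1) degrees of freedom
    lam = math.sqrt(1 + n * theta**2)
    return n1, n2, lam

# Example: m = 4 populations, n = 6 items each, theta = 0.5
print(anova_chart_parameters(4, 6, 0.5))
```

The returned triple gives the curve ($n_1$, $n_2$) and the abscissa $\lambda$ at which the probability of accepting $H_0$ would be read from the $F$-test charts.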

In the design of experiments when the number of populations is indefinite
(for example, daily tests) and the total sample size $mn$ is limited, the above
procedure will enable one to determine what values of $m$ and $n$ give the most
powerful operating characteristic for the given amount of sampling. For
example, for $mn = 24$ operating characteristics for all possible pairings were
computed and charted. They were observed to cross one another, each
combination in turn becoming most powerful for a given interval of $\theta$. The following
table gives the best pairings for various intervals of $\theta$:

 $m$    $n$    $\theta$
  2     12     .00- .32
  3      8     .32- .60
  4      6     .60- .91
  6      4     .91-1.37
  8      3    1.37-2.50
 12      2    2.50-


In contrast to the above discussion, mention should be made of P. C. Tang's
approach [4] to the power function of the analysis of variance. The basic
difference lies in the method of expressing the alternative hypothesis. Tang expresses
it in terms of the variance of a finite number of population means. We express
it in terms of normally distributed population means. We believe our approach
has considerable practical value in control chart analyses where we are interested
in the quality of the flow of production of a large number of lots. In addition,
our approach obviates the difficulties imposed by the non-central $\chi^2$-distribution.


6. Power function of the normal test.

A. The statistic $u = \sqrt{n}(\bar{x} - a)/\sigma_1$ is used to accept or reject the hypothesis
that the mean, $\mu$, of the normal population sampled, is some specified standard
level, $a$, when the population standard deviation is known (for example, from
past data).

Our hypotheses are

$H_0$: $\mu = a$

$H_1$: $|\mu - a| = \lambda\sigma_1$, $(\lambda > 0)$.

To test the hypothesis $\mu = a$, we choose a significance level, $\alpha$, and compute $u$.
If $|u| > u_\alpha$, where the percentage point, $u_\alpha$, is determined by

$$\frac{1}{\sqrt{2\pi}} \int_{-u_\alpha}^{u_\alpha} e^{-x^2/2}\, dx = 1 - \alpha, \tag{3}$$

we reject $H_0$ and conclude that $\mu \neq a$.

To set up the power function we note that:
If $H_0$ is true

$$\Pr\{-u_\alpha < u < +u_\alpha\} = 1 - \alpha.$$

If $H_1$ is true

$$\Pr\left\{-u_\alpha < \frac{\sqrt{n}(\bar{x} - a)}{\sigma_1} < u_\alpha\right\} = \beta, \qquad (1 - \beta = \alpha \text{ if } \lambda = 0),$$

$$= \Pr\left\{-u_\alpha + \lambda\sqrt{n} < \frac{\sqrt{n}(\bar{x} - \mu)}{\sigma_1} < u_\alpha + \lambda\sqrt{n}\right\}$$

where $\lambda = |\mu - a|/\sigma_1$.

In the latter expression the statistic $\sqrt{n}(\bar{x} - \mu)/\sigma_1$ is normally distributed with
zero mean and unit variance. The required probabilities are found easily from
tables of areas under the normal frequency curve. By computing enough
points $(\lambda, \beta)$ the operating characteristics depicted in Fig. 6 were constructed.

It should be noted that the $\beta$ corresponding to a pair of values $n'$ and $\lambda'$ may
be obtained from any other operating characteristic by use of the relation
$\lambda = \lambda'\sqrt{n'/n}$. For example, if it is desired to find the Type II error for a sample
size of $n' = 12$ and $\lambda' = 1$, select any operating characteristic, say for $n = 3$,
as the reference curve, compute $\lambda = 1 \cdot \sqrt{12/3} = 2$, and find from the curve for
$n = 3$ that $\beta$ = .07. In Fig. 6, however, individual operating characteristics
are plotted for convenience and to provide a picture of the comparative
efficiency of various sample sizes.
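The rescaling works because $\beta$ depends on $\lambda$ and $n$ only through the product $\lambda\sqrt{n}$. A minimal sketch using the standard normal distribution from the Python standard library is given below. The function names are illustrative, and $u_\alpha$ is taken as the two-sided percentage point defined by (3).

```python
from statistics import NormalDist
import math

STD_NORMAL = NormalDist()

def type_ii_error(lam, n, alpha=0.05):
    """beta for the two-sided normal test when |mu - a| = lam * sigma_1."""
    u_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = lam * math.sqrt(n)
    return STD_NORMAL.cdf(u_alpha + shift) - STD_NORMAL.cdf(-u_alpha + shift)

def equivalent_lambda(lam_prime, n_prime, n):
    """lambda on the reference curve for n that matches (lam_prime, n_prime)."""
    return lam_prime * math.sqrt(n_prime / n)

lam = equivalent_lambda(1.0, 12, 3)
print(lam, round(type_ii_error(lam, 3), 3), round(type_ii_error(1.0, 12), 3))
```

The two printed values of $\beta$ coincide, which is the property that allows any one curve to serve as the reference curve.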

Example. Pressure-measuring instruments are being tested against a standard
level. It has been decided that instruments whose true mean reading is as
much as 10 pounds per square inch away from the standard level should be
rejected 95% of the time. On the other hand only 5% of instruments whose
true mean reading equals that of the standard should be rejected. From past
data, it is known that all test instruments of the type being considered have a
stable standard deviation of 5 psi. If rejection or acceptance is to occur on the
basis of a single sample and the normal criterion of significance, what sample
size should be chosen to accomplish this purpose? Referring to Fig. 6 with $\lambda$ =
10/5 = 2 it is seen that a sample size of 4 provides the required assurance.
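The chart reading in this example can be mirrored by a direct search over sample sizes. The sketch below again assumes the two-sided normal test with $\beta$ given by the area between $\pm u_\alpha + \lambda\sqrt{n}$. The function names and the search limit `n_max` are our own choices.

```python
from statistics import NormalDist
import math

def normal_test_beta(lam, n, alpha=0.05):
    """Type II error of the two-sided normal test for a shift of lam standard deviations."""
    z = NormalDist()
    u_alpha = z.inv_cdf(1 - alpha / 2)
    shift = lam * math.sqrt(n)
    return z.cdf(u_alpha + shift) - z.cdf(-u_alpha + shift)

def smallest_sample_size(lam, alpha=0.05, beta_max=0.05, n_max=1000):
    """Smallest n whose Type II error does not exceed beta_max, or None."""
    for n in range(1, n_max + 1):
        if normal_test_beta(lam, n, alpha) <= beta_max:
            return n
    return None

# Pressure-instrument example: lambda = 10/5 = 2
print(smallest_sample_size(10 / 5))
```

Lowering `beta_max` or `lam` in the call shows how quickly the required sample size grows when closer alternatives must be detected.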

B. In sampling two normal populations $\pi_1$ and $\pi_2$, the statistic

$$u = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

is used to accept or reject the hypothesis that $\mu_1 = \mu_2$. For generality it will be
assumed that the population standard deviations $\sigma_1$ and $\sigma_2$ may not be equal,
although they are known accurately.

Our hypotheses are

$H_0$: $\mu_1 = \mu_2$

$H_1$: $|\mu_1 - \mu_2| = \lambda\sigma_1$.

Significance is determined in the same manner as in 5. A., and the power
function is set up in identical fashion. The value $\beta$ is found to be the area
under the standardized normal curve between the abscissas

$$\pm u_\alpha + \lambda\sqrt{\frac{n_1 n_2}{k^2 n_1 + n_2}}$$

where $\sigma_2 = k\sigma_1$. The value of $\beta$ may easily be read from Fig. 6 for any $\lambda'$, $n_1$,
$n_2$, and $k$ by selecting the curve for a convenient sample size, $n$, on Fig. 6 and
taking

$$\lambda = \frac{\lambda'}{\sqrt{n}}\sqrt{\frac{n_1 n_2}{k^2 n_1 + n_2}}.$$

7. Power function of the t-test.

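A. The statistic $t = \sqrt{n}(\bar{x} - a)/s$ is used to accept or reject the hypothesis that
the mean, $\mu$, of the normal population sampled, is equal to some specified level,
$a$, when the population standard deviation, $\sigma_1$, is unknown.

Our hypotheses are

$H_0$: $\mu = a$

$H_1$: $|\mu - a| = \lambda\sigma_1$, $(\lambda > 0)$.

In order to test the hypothesis $\mu = a$ we choose a significance level, $\alpha$, and
compute the statistic $t = \sqrt{n}(\bar{x} - a)/s$. If $|t| > t_\alpha$, where the percentage point,
$t_\alpha$, is determined by

$$\frac{\Gamma(n/2)}{\sqrt{(n-1)\pi}\;\Gamma[(n-1)/2]} \int_{-t_\alpha}^{t_\alpha} \left(1 + \frac{t^2}{n-1}\right)^{-n/2} dt = 1 - \alpha,$$

we reject $H_0$ and conclude that $\mu \neq a$.

To set up the power function we note that:
If $H_0$ is true

$$\Pr\{-t_\alpha < t < +t_\alpha\} = 1 - \alpha.$$

If $H_1$ is true,

$$\Pr\{-t_\alpha < t < t_\alpha\} = \beta, \qquad (1 - \beta = \alpha \text{ if } \lambda = 0).$$

However, we have the identity

$$\Pr\left\{-t_\alpha \frac{s}{\sigma_1} + \lambda\sqrt{n} < \frac{\sqrt{n}(\bar{x} - \mu)}{\sigma_1} < +t_\alpha \frac{s}{\sigma_1} + \lambda\sqrt{n}\right\} = \Pr\{-t_\alpha < t < +t_\alpha\}$$

where $\lambda = |\mu - a|/\sigma_1$. Hence, for any fixed $s/\sigma_1$, the above probability may be
evaluated as the area under the standardized normal curve between the abscissas
indicated, and denoted by, say, $h(s/\sigma_1)$ or, using the notation of section 4, $h(\chi^2)$. Then

$$\beta = \int_0^\infty h(\chi^2) f(\chi^2)\, d\chi^2$$

where $f(\chi^2)$ is the probability density function of $\chi^2$ for $n - 1$ degrees of freedom.
This is one method of evaluating $\beta$ and it was used for calculating the operating
characteristics for $n < 5$.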

It has been noted that such a formula had been employed by Neyman and
Tokarska [6] in calculating Type II errors where only one tail of the $t$-curve is
used as the region of rejection. Probabilities calculated in this manner are
provided by Neyman and Tokarska for degrees of freedom $n$ = 1 to 30 and Type
I errors of .01 and .05. As soon as the area in one tail of the non-central
$t$-distribution becomes negligible these curves are equivalent to the test treated
herein with an $\alpha$ of .02 and .10 respectively. An idea of the critical values of $\lambda$
at which this occurs may be obtained from a table in a succeeding footnote in
which they are quoted for $\alpha$ = .05. The values are surprisingly small, such that
almost all of Neyman's figures can be interpreted for a two-tail region of
rejection.

Using C. C. Craig's development of the non-central $t$ [7] we obtain²

$$\beta = \Pr\left\{-t_\alpha < \frac{\sqrt{n}(\bar{x} - \mu)/\sigma_1 + \sqrt{n}\,\lambda}{s/\sigma_1} < t_\alpha\right\}
= e^{-n\lambda^2/2} \sum_{r=0}^{\infty} \frac{(\tfrac{1}{2}n\lambda^2)^r}{r!}\; I\!\left(r + \tfrac{1}{2},\ \tfrac{1}{2}(n - 1);\ \frac{t_\alpha^2}{n - 1 + t_\alpha^2}\right)$$

where $I(p, q; x)$ represents the Incomplete-Beta Function Ratio [7]. This may
be conveniently used for those values of $n$ where the necessary values are
obtainable from Tables of the Incomplete-Beta Function Ratio [8] and for small values
of $\lambda$ where the above series converges rapidly.

The method actually used for $n > 4$, however, made use of the tables
prepared by Johnson and Welch [9]. Replacing their $\lambda$ by $\pi$ to avoid confusion
with our notation, these tables give values of $\pi$ tabulated against $f$, $t_0$, and $\epsilon$ such
that

$$\Pr\left\{t = \frac{z + \delta}{\sqrt{w}} > t_0\right\} = \epsilon$$

where $z$ is a normally distributed variate with zero mean and unit variance, $fw$
is distributed according to the $\chi^2$-distribution with $f$ degrees of freedom, and
$\delta = t_0 - \pi\sqrt{1 + t_0^2/2f}$. We want

$$\beta = 1 - \Pr\{t < -t_\alpha\} - \Pr\{t > t_\alpha\}.$$

For those values of $\lambda$ and $n$ for which $\Pr\{t < -t_\alpha\}$ is negligible³ we can, for
any given $\epsilon$, take $t_0 = t_\alpha$ and $f = n - 1$ and read $\pi$ from the tables, then
determine $\delta$ and finally $\lambda$ from the relation $\lambda = \delta/\sqrt{n}$. After computing $\beta = 1 - \epsilon$,
the point $(\lambda, \beta)$ on the operating characteristic may be graphed. At the few
places where $\Pr\{t < -t_\alpha\}$ is not negligible and $\beta$ is needed we can for a given $\lambda$
take

$$\pi = \frac{t_0 - \delta}{\sqrt{1 + t_0^2/2f}}$$

and then by reading $\pi$ for various values of $\epsilon$, $f$, $t_0$ make an inverse interpolation
for $\epsilon$, thus setting values for $\Pr\{t > -t_\alpha\}$ and $\Pr\{t > t_\alpha\}$. Finally

$$\beta = \Pr\{t > -t_\alpha\} - \Pr\{t > +t_\alpha\}.$$

² It should be noted that Craig's formula as published is in error in having $\tfrac{1}{2}(r + 1)$ as
the parameter in the incomplete beta function instead of $r + \tfrac{1}{2}$.

³ Values of $\lambda$ for which $\Pr\{t < -t_{.05}\} = .005$ are listed below.

$f = n - 1$    $\lambda$
     4         .34
     5         .30
     6         .27
     7         .25
     8         .23
     9         .216
    16         .159
    36         .103
   144         .051
    ∞          .000

It was found that for $n > 10$ a good approximation for computing operating
characteristics is given by

$$\beta = \Pr\{-t_\alpha + \lambda\sqrt{n} < t < +t_\alpha + \lambda\sqrt{n}\}$$

in which the variable $t$ is distributed as central $t$ with $n - 1$ degrees of freedom.
This formula proved to be quite useful in preparation of the operating
characteristics for the $t$-test.

Fig. 7 presents operating characteristics of the $t$-test calculated by these
methods. It should be noted that in using the $t$-test, alternative hypotheses
are expressed as so many multiples of the unknown population standard
deviation away from the level stated in the null hypothesis. In some applications
the alternatives may be naturally so expressed. In many applications,
however, it may be desired to control the distance $\mu - a$ regardless of the
standard deviation of the lot sampled. In this case, one could place confidence limits
on the estimate of $\sigma_1$, determine the $\lambda$ value corresponding to each estimate, and
finally obtain limits on the sample sizes or risks involved.⁴

B. For the case of two normal populations, the statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{12}\sqrt{1/n_1 + 1/n_2}}$$

is used to accept or reject the hypothesis that $\mu_1 = \mu_2$ when the two normal
population standard deviations are unknown but equal to, say, $\sigma_1$.

Our hypotheses are

$H_0$: $\mu_1 = \mu_2$

$H_1$: $|\mu_1 - \mu_2| = \lambda\sigma_1$.

Significance is determined in the same manner as in par. 6. A., and, by
reasoning similar to that in the preceding section, it is found that $\beta$ for a given $\lambda'$ can
be read from Fig. 7 by taking

$$\lambda = \frac{\lambda'}{\sqrt{n}}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

and $n = n_1 + n_2 - 1$. Before a statistical test of this nature is applied the data
should be examined to verify consistency with the assumption that $\sigma_1 = \sigma_2$.

⁴ For a test of this nature in which the power of the test depends only on the absolute
value of the distance $\mu - a$ see [10].

Fig. 7. Operating Characteristics of the t-Test [$t = \sqrt{n}(\bar{x} - a)/s$] for Testing $\mu = a$ against $\mu \neq a$

Example: An analysis of the difference in tensile strength between two types
of castings is being conducted. A sample of 10 items is selected from each type
of casting and the $t$-test employed to establish superiority of one over the other.
Experience has shown that the variability in tensile strength for one type of
casting is comparable to that of the other type. If $\alpha$ is set equal to .05, what
percentage of the time would our significance test fail to detect a superiority of
one standard deviation in tensile strength? $n = 10 + 10 - 1 = 19$ and $\lambda$ =
.513. Referring to Fig. 7 for this $\lambda$ and $n$, it is seen that the percentage $\beta$ is
approximately 45.
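The arithmetic of this example, together with the central-$t$ approximation $\beta = \Pr\{-t_\alpha + \lambda\sqrt{n} < t < t_\alpha + \lambda\sqrt{n}\}$ stated for $n > 10$, can be sketched as follows. This is an illustration under our own assumptions, not the authors' computing procedure: the $t$ density is integrated by Simpson's rule, the percentage point is found by bisection, and all function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of central t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_prob_between(a, b, df, steps=2000):
    """Pr{a < t < b} by composite Simpson integration (steps is even)."""
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return total * h / 3

def t_two_sided_point(alpha, df):
    """t_alpha with Pr{-t_alpha < t < t_alpha} = 1 - alpha, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_prob_between(-mid, mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def two_sample_chart_values(lam_prime, n1, n2):
    """Return (n, lambda) at which to enter the single-sample t chart."""
    n = n1 + n2 - 1
    return n, lam_prime / math.sqrt(n) * math.sqrt(n1 * n2 / (n1 + n2))

def approx_type_ii_error(lam, n, alpha=0.05):
    """Central-t approximation to beta, intended for n > 10."""
    df = n - 1
    t_alpha = t_two_sided_point(alpha, df)
    shift = lam * math.sqrt(n)
    return t_prob_between(-t_alpha + shift, t_alpha + shift, df)

n, lam = two_sample_chart_values(1.0, 10, 10)
print(n, round(lam, 3), round(approx_type_ii_error(lam, n), 2))
```

The printed triple gives $n$, $\lambda$ and the approximate $\beta$ for the castings example, which may be set beside the value read from Fig. 7.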

In this paper we have presented power curves or operating characteristics of
the common significance tests employing but a single sample of items. The
power of the tests obtained here does not represent the limit that can be obtained
for the average amount of inspection performed, say, over many consecutive
lots. Tests, sequential in character [11], have been shown to be much more
efficient. Nevertheless, single sampling is often the only practical procedure
available. Again, the data may be brought to the analyst as single sample
results collected supplementary to other purposes or prescribed by a standard
procedure. Finally, in performing a significance test, it is quite important to be
able to give constructive advice when the data indicate practical differences
although no statistical significance is found.⁵

Although sequential tests using variables have been devised, no investigation 
of double sampling schemes for variables similar to the Dodge-Romig [12] 
plans for attributes has, as yet, been designed with the exception of [9]. It is 
believed, however, that such plans would have considerable application for 
industry in combining efficiency with practicability. 

The graphs of the operating characteristics in this report have been made by
calculating a sufficient number of points to draw them in by use of French curves.
Considering this method of plotting, slight error should be allowed for in reading
probabilities of acceptance from the graphs, especially where the curves are
steep.

MINIMAL VARIANCE AND ITS RELATION TO EFFICIENT
MOMENT TESTS

By J. R. Vatnsdal
State College of Washington

1. Summary. When a curve is fitted to a set of data by moments, the usual
procedure used in testing the hypothesis that the population is of the given form
with the parameters as computed from the moments is to compare the higher
moments with their expected values as determined by the hypothesis.
Generally speaking, moments about the mean are computed although the reason for
this is not clear. To shed some light on this question, the sample given in the
introduction is fitted to two curves. Moments about various points are com-
pared with their expected values and the discrepancy in standard units ex- 
amined. This discrepancy is found to vary widely and to have a maximum. 
The notion of equivalent moment tests is introduced, and on this basis the most 
efficient moment test is defined in such a way that of all equivalent moment 
tests, this one is most likely to reject a false hypothesis. 

For any moment it is shown that there is a point about which its variance is a 
minimum. The conditions are found which determine the position of this point 
for second and third moments. It is proved that for symmetrical populations
the variance is minimal when the moments are computed about the mean of the 
population. If the population is an asymmetrical Pearson frequency function, 
it is proved that the point about which the third moment variance is minimal 
differs more from the mean than does the corresponding point for second mo- 
ments. The condition is pointed out for which this is true in the general case. 
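For the second moment the existence of such a point can be illustrated with a short calculation. If $c$ is a fixed point, $d$ is the population mean minus $c$, and $\mu_2$, $\mu_3$, $\mu_4$ are the central moments, then expanding $(x - c)^2$ about the mean gives a variance of $\mu_4 - \mu_2^2 + 4d\mu_3 + 4d^2\mu_2$ for a single squared deviation (divide by the sample size for the sample moment). This is smallest when $c$ equals the mean plus $\mu_3/2\mu_2$. The sketch below is our own illustration under these assumptions, with hypothetical function names; the exponential population with unit mean ($\mu_2 = 1$, $\mu_3 = 2$, $\mu_4 = 9$) is chosen only as an asymmetrical example.

```python
def variance_of_squared_deviation(c, mean, mu2, mu3, mu4):
    """Variance of (x - c)^2 for a fixed point c, from the central moments."""
    d = mean - c
    return mu4 - mu2**2 + 4 * d * mu3 + 4 * d**2 * mu2

def minimizing_point(mean, mu2, mu3):
    """Point about which the second moment has minimal variance."""
    return mean + mu3 / (2 * mu2)

# Exponential population with unit mean (illustrative choice).
mean, mu2, mu3, mu4 = 1.0, 1.0, 2.0, 9.0
c_star = minimizing_point(mean, mu2, mu3)
print(c_star,
      variance_of_squared_deviation(c_star, mean, mu2, mu3, mu4),
      variance_of_squared_deviation(mean, mean, mu2, mu3, mu4))
```

When $\mu_3 = 0$ the minimizing point reduces to the mean, in agreement with the statement for symmetrical populations.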

The third and fourth standard semi-invariants of second moments of minimal 
variance are computed and compared to those of the second moment about the 
mean. The ratios of these are displayed for some populations to illustrate how 
this may be used to investigate when the approach to normality is more rapid 
in one case than in the other. Some examples are presented to contrast these
and other tests. 

2. Introduction. In testing the hypothesis that a given set of observations 
is a random sample from a completely specified population (either a priori or 
specified by a consideration of the sample), generally the Chi-square test is 
applied or certain functions of the moments are compared with their expected 
values and the significance of their departure as determined by the hypothesis 
is examined. 

In the Neyman-Pearson theory it is required that the functional form be 
known. The hypothesis then is some statement concerning the parameters. 
The main principle there used is that the test used should be such that, while 
keeping the probability of rejecting the hypothesis when true at a certain sig-
nificance level, it will minimize the chance of accepting the hypothesis when
some alternative is true. 

However, if the functional form is regarded as unknown, the alternative hypoth- 
eses are then usually unknown. The test then must be one that does not 
depend on alternatives. In the light of incomplete knowledge of the distribu-
tion of sample statistics, and since moments of moments are practically the
only ones known, we shall here use the principle of comparing observed moments 
with their expected values. It is known that the distribution of moments in 
large samples is asymptotic to the normal distribution if the appropriate mo- 
ments of the population exist [1]. Here we shall confine ourselves to such 
populations and large samples. 

To introduce the idea which underlies the theory here presented, consider a 
simple example. Suppose a sample is given and the hypothesis is of the form 
$f(x, \theta)$ with $\theta = \theta_0$. Furthermore, suppose the first moment of the sample is
equal to its expected value. If a second-moment test is used, this means that 
one computes the arithmetic mean of the squares of the deviations of the elements 
of the sample about some point, and compares this with the theoretical moment 
about the same point. Generally speaking, the point used is the mean of the 
population or the mean of the sample. However, the point may be chosen in 
any manner. For each such choice a test can be devised such that the prob- 
ability of rejecting the hypothesis when true is $\epsilon$. All such tests are called equiv-
alent moment tests. Among these equivalent moment tests, one particular
second-moment will have the minimal variance. This one is here called the 
most efficient moment test. 

This test has the property that the range of values of the second moment for 
which the hypothesis is accepted is as small as possible. Thus of all equivalent 
second-moment tests, this one is most likely to reject a false hypothesis. 

This idea may be easily extended to moments of higher order, in all of which 
the concept of minimal variance is fundamental. The point of view may be 
taken that the point about which the moments are computed should be such
that the variance is a minimum, or what is equivalent, the variance of moments
about the origin is minimized by choosing the origin properly.

An example is here presented to bring this out more clearly. A sample of 
1,000 items is given and fitted by the first two moments to two different fre-
quency functions. (The sample items are not given here; they are to be found
in Tables for Statisticians [2].) The third and fourth moments have been
computed and the discrepancies in standard units as determined by the 
hypotheses are exhibited in a table. 

This sample of 1,000 items considered as a sample from an infinite population 
has these moments: 

$m'_1 = 139.288$
$m'_2 = 19692.452$
$m'_3 = 2827467.388$
$m'_4 = 412561061.04$
By fitting the first two moments of the sample to curve A,

$$y = \frac{a^{n+1}}{\Gamma(n+1)}\, x^{n} e^{-ax},$$

we get $a = 0.4781516735$ and $n = 65.60079029$; to curve B,

$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)},$$

we get $\mu = 139.288$ and $\sigma^2 = 291.305056$.

The discrepancy between the observed and theoretical rth moment about any
point is measured by

$$t = \frac{m''_r - \mu''_r}{\sqrt{\dfrac{\mu''_{2r} - \mu''^{\,2}_r}{n}}}$$

in which $m''_r$ is the rth moment of the sample of n about this point, and $\mu''_r$ is
the rth moment of the population about the same point.

The values of $|t|$ have been computed corresponding to various points for the
third and fourth moments. These are exhibited in four tables, given below. 

Examination of the table for the discrepancy between the observed and theo-
retical third moments for curve B shows that when this moment is computed
about x = 0, the hypothesis is accepted at the 1% level; this is also true for x
= 39.3, but for x = 139.3 the hypothesis would be rejected at that level. It
is evident that some rule must be established to decide what point is to be used
to make the test.

If the curve is fitted by the first two moments the value $m''_3 - \mu''_3$ is the same
for every point. This is easily demonstrated, for if $m''_3$ and $\mu''_3$ are measured
about a point h units to the right of the origin, $m''_3 = m'_3 - 3h m'_2 + 3h^2 m'_1 - h^3$
and $\mu''_3 = \mu'_3 - 3h\mu'_2 + 3h^2\mu'_1 - h^3$. Now, $m'_2 = \mu'_2$ and $m'_1 = \mu'_1$. It follows
that $m''_3 - \mu''_3 = m'_3 - \mu'_3$.
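
The cancellation can be checked numerically. The following is a minimal sketch, assuming a population fitted so that its first two moments equal those of the sample; the helper name `third_moment_about` is illustrative and does not come from the paper. The sample moments are the ones quoted in the text, and the population third moment is that of a normal curve with the same mean and variance.

```python
def third_moment_about(h, m1, m2, m3):
    """Third moment about the point h, given moments m1, m2, m3 about the origin."""
    return m3 - 3.0 * h * m2 + 3.0 * h**2 * m1 - h**3

# Sample moments about the origin, as quoted in the text.
m1, m2, m3 = 139.288, 19692.452, 2827467.388

# Hypothetical fitted population: normal with the same first two moments.
mu1, mu2 = m1, m2
variance = mu2 - mu1**2
mu3 = mu1**3 + 3.0 * mu1 * variance  # third raw moment of a normal population

# Difference between sample and population third moments at several points h.
differences = [
    third_moment_about(h, m1, m2, m3) - third_moment_about(h, mu1, mu2, mu3)
    for h in (0.0, 39.3, 139.3)
]
```

Up to rounding, every entry of `differences` equals `m3 - mu3`, so only the variance in the denominator of t changes with the point chosen.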

The maximum value of $|t|$ is attained when the variance of third moments is
a minimum. In this manner it is assured that the range of values for which
the third moment is accepted shall be a minimum.

If the third moments agree, or the agreement is sufficiently close such that the
hypothesis cannot be rejected, the difference between the observed and theoretical
fourth moments, $m''_4 - \mu''_4$, is constant or varies only slightly from
point to point, so that minimizing the variance yields the maximum value of $|t|$.
As is seen from the tables, when the moments are compared at the dif-
ferent points, the hypothesis may be accepted for one point and rejected for
another. By the rule of using the point which yields the minimal variance,
the hypothesis will be rejected more often than for other points. Thus, of all
equivalent moment tests this one is most likely to reject a false hypothesis.

The problem of determining for various moments how the origin may be
chosen such that the variance of the distribution of these moments shall be a
minimum is now considered.
3. First moments. In the case of the first moment, whose expected value is
the mean of the population, the variance is given by $\frac{1}{n}(\mu'_2 - \mu'^{\,2}_1)$. It is obvious
that the choice of origin does not affect the variance of the first moment, since
it is well known that $\mu'_2 - \mu'^{\,2}_1$ is invariant with respect to choice of origin.

4. Minimal variance of second moments. The variance of second moments
about an arbitrary origin is $\frac{1}{n}(\mu'_4 - \mu'^{\,2}_2)$. Expressed in terms of $\mu'_1$ and central


TABLES

Curve A.

Third moments          Fourth moments
Point      t           Point      t
0          .0365       0          .197
50         .084        50         .697
100        .33         100        4.74
120        .77         120        14.17
130        1.28        130        26.76
140        1.91        140        49.03
142        1.95        145        45.26
145        1.90        150        42.89
150        1.60        160        21.31
160        .95         180        6.25
170        .57         200        2.51
180        .37         300        .183
200        .18

Curve B.

Third moments          Fourth moments
Point      t           Point      t
0          .085        0          .02
39.3       .19         39.3       .13
89.3       .69         99.3       .88
109.3      1.16        109.3      1.09
119.3      2.39        119.3      2.00
129.3      4.05        129.3      3.18
139.3      5.57        133.3      3.83
149.3      4.05        135.3      3.96
159.3      2.39        137.3      3.93
169.3      1.16        139.3      3.67
179.3      .98         140.3      3.46
189.3      .69         143.3      2.72
199.3      .50         148.3      1.59
209.3      .38         159.3      .39
239.3      .19         179.3      .13
                       239.3      .07


moments, this may be written

(1) $\mu_2(m'_2) = \frac{1}{n}\left(\mu_4 - \mu_2^2 + 4\mu_3\mu'_1 + 4\mu_2\mu'^{\,2}_1\right).$


Here it is evident that the variance of second moments does depend on the 
choice of origin, and is not invariant under translation. 


The minimum value of $\mu_2(m'_2)$ is given by $\mu'_1 = -\frac{\mu_3}{2\mu_2}$ and is $\frac{1}{n}\left(\mu_4 - \mu_2^2 - \frac{\mu_3^2}{\mu_2}\right)$.

Then we may write

(2) $\mu_2(m_2^*) = \frac{1}{n}\left(\mu_4 - \mu_2^2 - \frac{\mu_3^2}{\mu_2}\right).$
Throughout this paper $m_2^*$ denotes the second moment of the sample about
an origin chosen such that $\mu'_1 = -\frac{\mu_3}{2\mu_2}$, which is the value of $\mu'_1$ which minimizes
(1); $m_2^0$ denotes the second moment about an origin chosen such that $\mu'_1 = 0$;
$m_2$ denotes the second moment about the mean of the sample. It may be noted
that in large samples the distributions of $m_2^0$ and $m_2$ are approximately the same.

It is clear from (2) that if $\mu_3 = 0$, or, if the population is symmetric, i.e. $f(-x)
= f(x)$, then $\mu_2(m_2^*) = \mu_2(m_2^0)$. However, if $\mu_3 \neq 0$ then $\mu_2(m_2^*) < \mu_2(m_2^0)$.
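
As an illustration of (1) and of the minimizing origin, here is a minimal sketch for a Poisson population with mean M = 0.5 and samples of n = 1000. The function names are illustrative only; the central moments used (variance M, third moment M, fourth moment M + 3M^2) are the standard Poisson values.

```python
def var_second_moment(mu2, mu3, mu4, mu1_prime, n):
    """Equation (1): variance of the sample second moment when the
    population mean lies mu1_prime from the chosen origin."""
    return (mu4 - mu2**2 + 4.0 * mu3 * mu1_prime + 4.0 * mu2 * mu1_prime**2) / n

def best_mu1_prime(mu2, mu3):
    """Value of mu1_prime that minimizes equation (1)."""
    return -mu3 / (2.0 * mu2)

# Poisson population with mean M (central moments: M, M, M + 3 M^2).
M, n = 0.5, 1000
mu2, mu3, mu4 = M, M, M + 3.0 * M**2

shift = best_mu1_prime(mu2, mu3)                        # -0.5 for any Poisson
v_mean = var_second_moment(mu2, mu3, mu4, 0.0, n)       # origin at the mean
v_star = var_second_moment(mu2, mu3, mu4, shift, n)     # minimizing origin
```

Here `v_mean` is (2M^2 + M)/n and `v_star` is 2M^2/n, so for this skew population the minimizing origin halves the variance of the second moment.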


5. A moment inequality. Since the quantity given by (2) is essentially
non-negative, an inequality is obtained valid for any distribution in which the
first four moments exist, viz.

(3) $\mu_4 - \mu_2^2 - \frac{\mu_3^2}{\mu_2} \geq 0, \quad \mu_2 \neq 0$

or in standard moments

(4) $\alpha_4 - \alpha_3^2 - 1 \geq 0.$

This is a stronger inequality than the one given by Bertelsen [3], i.e. $\alpha_3^2 -
\alpha_4 - 2 \leq 0$, or the one generally known, $\alpha_4 \geq \alpha_3^2$ [4]. This inequality, however,
was known to K. Pearson [5, p. 432], although he derived it from a different
point of view.
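
The inequality in standard moments can be tried on small discrete populations. The sketch below is only an illustration; `standard_moments` and `inequality_gap` are hypothetical helper names. It evaluates $\alpha_4 - \alpha_3^2 - 1$ for a two-point population, where the bound is attained, and for a fair die, where it is strict.

```python
def standard_moments(values, probs):
    """Return (alpha3, alpha4) for a discrete population."""
    mean = sum(p * v for v, p in zip(values, probs))

    def central(r):
        return sum(p * (v - mean) ** r for v, p in zip(values, probs))

    mu2, mu3, mu4 = central(2), central(3), central(4)
    return mu3 / mu2**1.5, mu4 / mu2**2

def inequality_gap(values, probs):
    """Left member of the inequality in standard moments: alpha4 - alpha3^2 - 1."""
    a3, a4 = standard_moments(values, probs)
    return a4 - a3**2 - 1.0

# Two-point population: the gap is zero (up to rounding).
gap_two_point = inequality_gap([0, 1], [0.7, 0.3])

# Fair die: a symmetric population with a strictly positive gap.
gap_die = inequality_gap([1, 2, 3, 4, 5, 6], [1.0 / 6.0] * 6)
```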


6. Minimal variance of higher moments. The variance of the distribution
of rth moments of random samples about an arbitrary origin always has a
minimum. The variance of $m'_r$ is given by

(5) $\mu_2(m'_r) = \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right).$

This expression when expanded in powers of $\mu'_1$ is always a polynomial of even
degree with the coefficient of the highest power a positive number. Further-
more, by differentiating with respect to $\mu'_1$ and equating the derivative to
zero, the value of $\mu'_1$ which minimizes $\mu_2(m'_r)$ will be found among the solutions
of that equation.

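For third moments of samples the variance is given by

$$\mu_2(m'_3) = \frac{1}{n}\left[\mu'_6 - \mu'^{\,2}_3\right]$$

which, when expressed in terms of moments about the mean and powers of the
mean, becomes

(6) $\frac{1}{n}\left[\mu_6 - \mu_3^2 + 6(\mu_5 - \mu_3\mu_2)\mu'_1 + (15\mu_4 - 9\mu_2^2)\mu'^{\,2}_1 + 18\mu_3\mu'^{\,3}_1 + 9\mu_2\mu'^{\,4}_1\right].$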
Differentiating with respect to $\mu'_1$ and equating to zero, we have

(7) $6\mu_2\mu'^{\,3}_1 + 9\mu_3\mu'^{\,2}_1 + (5\mu_4 - 3\mu_2^2)\mu'_1 + (\mu_5 - \mu_3\mu_2) = 0.$

By straightforward application of the methods of solving cubics, it is easy to
show by means of (3) that (7) has one real root only, which moreover is $\lesseqgtr -\frac{\mu_3}{2\mu_2}$
as

$$\alpha_5 - \alpha_3\left(\tfrac{5}{2}\alpha_4 - \tfrac{3}{2}\alpha_3^2 - \tfrac{1}{2}\right) \gtreqless 0.$$

Since it can also be shown by means of (3) that the second derivative of (6) is
positive, this root of (7) will minimize $\mu_2(m'_3)$.
These facts demonstrate:

Theorem I. The point about which the arithmetic mean of the cubes of the
variates has minimal variance is to the right, at, or to the left of the corresponding
point for the squares according as

(8) $\alpha_5 - \alpha_3\left(\tfrac{5}{2}\alpha_4 - \tfrac{3}{2}\alpha_3^2 - \tfrac{1}{2}\right) \gtreqless 0.$


By examination of (7) it is readily seen that if $\alpha_5 = \alpha_3$ or if the population is
symmetric, the real root will be zero; so that for such a population the variance
of third moments is a minimum when moments are taken about the mean of the
population. If $\alpha_5 \neq \alpha_3$ the variance of third moments will be a minimum when
taken about some other point.

For fourth moments of samples the variance is of the sixth degree in $\mu'_1$ and
its derivative therefore of the fifth degree. There is not much to be said in a
general way except that if $\alpha_7 = \alpha_4\alpha_3$ or if the population is symmetric, $\mu'_1 = 0$
will cause this derivative to vanish.
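
The cubic for third moments and the comparison in Theorem I can be illustrated numerically. The following is a rough sketch, assuming a Poisson population with mean M = 0.5 (central moments M, M, M + 3M^2, M + 10M^2); the function names are illustrative. The sign criterion evaluated here is the left member of the cubic at $\mu'_1 = -\mu_3/(2\mu_2)$, expressed in standard moments as $\alpha_5 - \alpha_3(\tfrac{5}{2}\alpha_4 - \tfrac{3}{2}\alpha_3^2 - \tfrac{1}{2})$.

```python
def cubic_7(x, mu2, mu3, mu4, mu5):
    """Left member of the cubic whose real root minimizes the variance of third moments."""
    return 6.0 * mu2 * x**3 + 9.0 * mu3 * x**2 + (5.0 * mu4 - 3.0 * mu2**2) * x + (mu5 - mu3 * mu2)

def real_root(mu2, mu3, mu4, mu5):
    """Bisection for the single real root (the cubic is increasing with positive leading term)."""
    lo, hi = -1.0, 1.0
    while cubic_7(lo, mu2, mu3, mu4, mu5) > 0.0:
        lo *= 2.0
    while cubic_7(hi, mu2, mu3, mu4, mu5) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cubic_7(mid, mu2, mu3, mu4, mu5) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Poisson population with mean M.
M = 0.5
mu2, mu3, mu4, mu5 = M, M, M + 3.0 * M**2, M + 10.0 * M**2

root_cubes = real_root(mu2, mu3, mu4, mu5)   # mu'_1 minimizing the variance of cubes
root_squares = -mu3 / (2.0 * mu2)            # mu'_1 minimizing the variance of squares

# Sign criterion in standard moments.
a3, a4, a5 = mu3 / mu2**1.5, mu4 / mu2**2, mu5 / mu2**2.5
criterion = a5 - a3 * (2.5 * a4 - 1.5 * a3**2 - 0.5)
```

Since $\mu'_1$ is the distance of the mean from the origin, a more negative root places the origin further to the right of the mean. In this example the criterion is positive and the root for the cubes is more negative than that for the squares, in line with Theorem I.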

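If the distribution is a Pearson frequency function, from the recursion formula
for the moments [6, p. 24],

$$\alpha_5 = \alpha_3\,\frac{2\alpha_4 + 4 + 2\delta}{1 - \delta},$$

where

$$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3}.$$

The criterion (8) can be written

(9) $\alpha_3\,\frac{2\alpha_4 + 4 + 2\delta}{1 - \delta} + \tfrac{1}{2}\alpha_3 + \tfrac{3}{2}\alpha_3^3 - \tfrac{5}{2}\alpha_4\alpha_3.$

It will now be shown that (9) $\gtreqless 0$ according as $\alpha_3 \gtreqless 0$, since (9) is $\alpha_3 D$ where

(10) $D = \frac{2\alpha_4 + 4 + 2\delta}{1 - \delta} + \tfrac{1}{2} + \tfrac{3}{2}\alpha_3^2 - \tfrac{5}{2}\alpha_4.$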
1-5 
It suffices to show that D > 0 for all Pearson curves. Using the method of
Lagrange multipliers, it is possible to show that within the permissible range of
values of the variables involved, the g.l.b. of D is f , and so D > 0. It has been

proved that the variance of the squares is a minimum when $\mu'_1 = -\frac{\mu_3}{2\mu_2}$. It has
just been shown that the sign of (9) agrees with that of $\alpha_3$. These, together with
Theorem I, demonstrate

Theorem II. For Pearson frequency functions, $\alpha_3 \neq 0$, the point about which
the variance of cubes is a minimum deviates more from the mean than does the cor-
responding point for the squares.


7. Symmetric populations. For the distribution of rth moments of samples

(11) $\mu_2(m'_r) = \frac{1}{n}\left(\mu'_{2r} - \mu'^{\,2}_r\right).$

To find the minimum of (11) expand in terms of central moments and powers
of $\mu'_1$, differentiate with respect to $\mu'_1$, and equate to zero. This yields:

(12) $(2r - 2)\,r^2\mu_2\,\mu'^{\,2r-3}_1 + \cdots + K\left[\binom{2r}{K}\mu_{2r-K} - \sum_{i=0}^{K}\binom{r}{i}\binom{r}{K-i}\mu_{r-i}\,\mu_{r-K+i}\right]\mu'^{\,K-1}_1 + \cdots + 2r\left(\mu_{2r-1} - \mu_r\mu_{r-1}\right) = 0.$

For each power of $\mu'_1$, the coefficient is an isobaric moment function and is of
even weight when the power of $\mu'_1$ is odd, and of odd weight when the power of
$\mu'_1$ is even. If the population is symmetric the coefficients of even powers will
vanish as will the constant term. Then $\mu'_1$ will be a factor, the other factor
being a polynomial with only even powers of $\mu'_1$. In this latter factor, where K
is even, the coefficient of $K\mu'^{\,K-2}_1$ is

(13) $\binom{2r}{K}\mu_{2r-K} - \sum_{i=0}^{K}\binom{r}{i}\binom{r}{K-i}\mu_{r-i}\,\mu_{r-K+i}.$

Since

$$\binom{2r}{K} = \sum_{i=0}^{K}\binom{r}{i}\binom{r}{K-i},$$

(13) may be written

$$a\,\mu_{2r-K} + \sum_i b_i\left(\mu_{2r-K} - \mu_{r-i}\,\mu_{r-K+i}\right), \qquad r - i,\ K \text{ even},$$

where a, $b_i$ are non-negative integers.

It can be immediately established by use of an inequality due to Tchebycheff
[7, pp. 43, 168] that $\mu_{2k+2l} \geq \mu_{2k}\,\mu_{2l}$ and therefore (13) is positive or zero.

To sum up, if the odd moments vanish (12) will have a factor $\mu'_1$ and a factor
which is a polynomial with even powers only of $\mu'_1$ with positive coefficients;
therefore there is one and only one solution, $\mu'_1 = 0$. This establishes

Theorem III. For a symmetrical population, the distribution of rth moments
of samples has minimal variance when the origin is the population mean.


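8. Distribution of second moments. To study in more detail the distribu-
tions of $m_2^0$ and $m_2^*$ the higher moments are computed and compared. Applying
the formula for the distribution of rth moments we obtain, for $m_2^0$,

$$\mu'_1(m_2^0) = \mu_2$$

$$\mu_2(m_2^0) = \frac{1}{n}\left(\mu_4 - \mu_2^2\right)$$

(14) $\alpha_3(m_2^0) = \frac{1}{\sqrt{n}}\,\frac{\alpha_6 - 3\alpha_4 + 2}{(\alpha_4 - 1)^{3/2}}$

$$\alpha_4(m_2^0) - 3 = \frac{1}{n}\left[\frac{\alpha_8 - 4\alpha_6 + 6\alpha_4 - 3}{(\alpha_4 - 1)^2} - 3\right]$$

etc.

For the distribution of $m_2^*$, we get

$$\mu'_1(m_2^*) = \mu_2 + \frac{\mu_3^2}{4\mu_2^2}$$

$$\mu_2(m_2^*) = \frac{1}{n}\left(\mu_4 - \mu_2^2 - \frac{\mu_3^2}{\mu_2}\right)$$

(15) $\alpha_3(m_2^*) = \frac{1}{\sqrt{n}}\,\frac{\alpha_6 - 3\alpha_4 + 2 + 3\alpha_3^2 - 3\alpha_5\alpha_3 + 3\alpha_4\alpha_3^2 - \alpha_3^4}{(\alpha_4 - \alpha_3^2 - 1)^{3/2}}$

$$\alpha_4(m_2^*) - 3 = \frac{1}{n}\left[\left(\alpha_8 - 4\alpha_6 + 6\alpha_4 - 3 + 12\alpha_5\alpha_3 - 6\alpha_3^2 - 4\alpha_7\alpha_3 + 6\alpha_6\alpha_3^2 - 12\alpha_4\alpha_3^2 + 4\alpha_3^4 - 4\alpha_3^3\alpha_5 + \alpha_4\alpha_3^4\right)\left(\alpha_4 - \alpha_3^2 - 1\right)^{-2} - 3\right]$$

etc.

Computing the ratios of $\alpha_3$'s, we have

(16) $\frac{\alpha_3(m_2^*)}{\alpha_3(m_2^0)} = \left[1 - \frac{\alpha_3\left(3\alpha_5 - 3\alpha_3 - 3\alpha_4\alpha_3 + \alpha_3^3\right)}{\alpha_6 - 3\alpha_4 + 2}\right]\left[\frac{\alpha_4 - 1}{\alpha_4 - \alpha_3^2 - 1}\right]^{3/2}.$

Similarly

(17) $\frac{\alpha_4(m_2^*) - 3}{\alpha_4(m_2^0) - 3} = \left[1 - \frac{\alpha_3\left(4\alpha_7 + 6\alpha_4\alpha_3 + 4\alpha_3^2\alpha_5 + 12\alpha_3 - 12\alpha_5 - 6\alpha_6\alpha_3 - \alpha_3^3 - \alpha_3^3\alpha_4\right)}{\alpha_8 - 4\alpha_6 - 3\alpha_4^2 + 12\alpha_4 - 6}\right]\left[\frac{\alpha_4 - 1}{\alpha_4 - \alpha_3^2 - 1}\right]^{2}.$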
1 
It is evident that when $\alpha_3 = 0$, the ratio in each case is unity. These ratios
seem too involved to make any other general statements, but for particular types
of populations these ratios in terms of the parameters are considerably simplified.
To illustrate this statement, consider

$$f(x) = \frac{e^{-M}M^x}{x!}.$$


From the foregoing formulas we compute

$$\mu'_1(m_2^*) = M + \tfrac{1}{4}, \qquad \mu'_1(m_2^0) = M$$

$$\mu_2(m_2^*) = \frac{2M^2}{n}, \qquad \mu_2(m_2^0) = \frac{2M^2 + M}{n}$$

(18) $\frac{\alpha_3(m_2^*)}{\alpha_3(m_2^0)} = \sqrt{\frac{2}{M}}\;\frac{(2M + 1)^{5/2}}{8M^2 + 22M + 1}$

(19) $\frac{\alpha_4(m_2^*) - 3}{\alpha_4(m_2^0) - 3} = \frac{(12M^2 + 36M + 2)(2M + 1)^2}{M(48M^3 + 384M^2 + 112M + 1)}.$

The minimum value of (18) is 0.71 for M = 1.22 and (18) is < 1 for M > 0.31.
The minimum of (19) is 0.70 and is < 1 for M > 0.62. For the Poisson dis-
tribution, then, not only is the variance of $m_2^*$ less than that of $m_2^0$, but at least
as far as the first four moments are concerned, the distribution of $m_2^*$ approaches
normality more rapidly than does $m_2^0$ for all values of M > 0.62.
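
The quoted minima can be reproduced with a simple grid search. The sketch below evaluates the two Poisson ratios, written here as $\sqrt{2/M}\,(2M+1)^{5/2}/(8M^2+22M+1)$ for the skewness ratio and $(12M^2+36M+2)(2M+1)^2/[M(48M^3+384M^2+112M+1)]$ for the excess ratio; the function names and the grid of M values are illustrative choices, not part of the paper.

```python
import math

def ratio_alpha3(M):
    """Ratio of the skewness of m2* to that of m2^0 for a Poisson population."""
    return math.sqrt(2.0 / M) * (2.0 * M + 1.0) ** 2.5 / (8.0 * M**2 + 22.0 * M + 1.0)

def ratio_excess(M):
    """Ratio of the excess (alpha4 - 3) of m2* to that of m2^0 for a Poisson population."""
    numerator = (12.0 * M**2 + 36.0 * M + 2.0) * (2.0 * M + 1.0) ** 2
    denominator = M * (48.0 * M**3 + 384.0 * M**2 + 112.0 * M + 1.0)
    return numerator / denominator

# Coarse grid of Poisson means from 0.05 to 20.
grid = [0.05 * k for k in range(1, 401)]
min_alpha3 = min(ratio_alpha3(M) for M in grid)
min_excess = min(ratio_excess(M) for M in grid)
```

On this grid both minima come out close to the values 0.71 and 0.70 stated in the text, and both ratios tend to unity as M grows.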


When one follows the same procedure for

$$f(x) = \frac{1}{\Gamma(p)}\,x^{p-1}e^{-x},$$

it is found that not only
is the variance of $m_2^*$ less than that of $m_2^0$, but as far as the first four moments
are concerned, the distribution of $m_2^*$ approaches normality more rapidly than
does $m_2^0$, for values of p > 0.7.

In the case of higher moments, it seems desirable to solve the necessary equa- 
tions in each particular case, since the equations are somewhat involved. 


9. Examples. A few examples are exhibited to illustrate the foregoing ideas 
and to contrast with some of the other methods. 

1. A sample of 1,000 is obtained with the following distribution

x: 0 1 2 3 4

f: 625 269 91 11 4

The hypothesis being tested is that the population is $f(x) = \frac{e^{-M}M^x}{x!}$, with M =
0.5.

$\bar{x} = 0.5$ and therefore the mean does not differ from its expected value.

By using the $m_2^*$ test, we compute t = 2.06. If $m_2^*$ is distributed normally,
the hypothesis is rejected at the 5% level. By using the $m_2^0$ test, we find t =
1.45, and therefore by this test the hypothesis is not rejected at the 5% level.
Applying the $\chi^2$ test, we find that the hypothesis is not rejected at the 5% level.
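
The two values of t in this example can be recomputed directly from the frequency table. The following sketch assumes the Poisson expressions used earlier (expected second moments M and M + 1/4, variances (2M^2 + M)/n and 2M^2/n); the variable and function names are illustrative.

```python
import math

values = [0, 1, 2, 3, 4]
freqs = [625, 269, 91, 11, 4]
n = sum(freqs)          # 1000 observations
M = 0.5                 # hypothesized Poisson mean

def sample_second_moment(origin):
    """Mean squared deviation of the sample about the given origin."""
    return sum(f * (x - origin) ** 2 for x, f in zip(values, freqs)) / n

# Test about the population mean: expected value M, variance (2M^2 + M)/n.
m2_zero = sample_second_moment(M)
t_zero = (m2_zero - M) / math.sqrt((2.0 * M**2 + M) / n)

# Minimal-variance test: the mean lies 1/2 to the left of the origin,
# so the origin is M + 1/2; expected value M + 1/4, variance 2M^2/n.
m2_star = sample_second_moment(M + 0.5)
t_star = (m2_star - (M + 0.25)) / math.sqrt(2.0 * M**2 / n)
```

Both tests see the same discrepancy of 0.046 between observed and expected second moments; only the standard error differs, which is why the minimal-variance test gives the larger t.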

2. We return now to the sample mentioned in the introduction. 

Since the parameters in population A were found by fitting the first two mo-
ments, the tests will be made on the higher moments. From the definition of
$m_2^0$ and $m_2^*$ it is clear what is meant by $m_3^0$, $m_3^*$, $m_4^0$ and $m_4^*$.

Consider the discrepancy of third moments in standard units t as a function
of h, the distance from the origin. It is easy to see that

$$t = \frac{m'_3 - \mu'_3}{\sigma_t}$$

where

$$\sigma_t^2 = \frac{1}{n}\left[\mu'_6 - \mu'^{\,2}_3 - 6h(\mu'_5 - \mu'_2\mu'_3) + 3h^2(5\mu'_4 - 3\mu'^{\,2}_2 - 2\mu'_1\mu'_3) - 18h^3(\mu'_3 - \mu'_1\mu'_2) + 9h^4(\mu'_2 - \mu'^{\,2}_1)\right].$$

For the $m_3^0$ test, h = 139.288. The value of h which minimizes the variance
is a solution of $6(\mu'_2 - \mu'^{\,2}_1)h^3 - 9(\mu'_3 - \mu'_1\mu'_2)h^2 + (5\mu'_4 - 3\mu'^{\,2}_2 - 2\mu'_3\mu'_1)h -
(\mu'_5 - \mu'_2\mu'_3) = 0$, which, for this population is h = 142.66. Using these values
and computing, we find, for the $m_3^0$ test, t = 1.90 and for the $m_3^*$ test, t = 1.95.

Using the same methods applied to fourth moment tests, we obtain for the
$m_4^0$ test, h = 139.288 and t = 48.7, and for the $m_4^*$ test, h = 143.73 and t =
52.4.

The $\chi^2$ test cannot be used here since the moments alone are given; further-
more there is some difficulty in interpreting it under these conditions.

In this particular example, the third moment test would not reject the hypoth- 
esis at the 1% level, while the fourth moment test would reject at that level. 

3. Since population B is symmetric, it is known that the $m_3^0$ and $m_3^*$ tests are
identical; similarly for $m_4^0$ and $m_4^*$. For the $m_3^*$ test, t = 5.67, which would
reject the hypothesis at the 1% level. The fourth moment test would not be
applied in practice.

The writer wishes to acknowledge his indebtedness to Professor P. S. Dwyer 
for counsel and guidance. He also wishes to thank Professors H. C. Carver and 
C. C. Craig for valuable suggestions.
TOLERANCE LIMITS FOR A NORMAL DISTRIBUTION 1


By A. Wald and J. Wolfowitz 


Columbia University and University of North Carolina 


Summary. The problem of constructing tolerance limits for a normal uni-
verse is considered. The tolerance limits are required to be such that the prob-
ability is equal to a preassigned value $\beta$ that the tolerance limits include at least a
given proportion $\gamma$ of the population. A good approximation to such tolerance
limits can be obtained as follows: Let $\bar{x}$ denote the sample mean and $s^2$ the sample
estimate of the variance. Then the approximate tolerance limits are given by

$$\bar{x} - \sqrt{\frac{n}{\chi^2_{n,\beta}}}\; r s \qquad \text{and} \qquad \bar{x} + \sqrt{\frac{n}{\chi^2_{n,\beta}}}\; r s$$

where n is one less than the number N of observations, $\chi^2_{n,\beta}$ denotes the number for
which the probability that $\chi^2$ with n degrees of freedom will exceed this number is
$\beta$, and r is the root of the equation

$$\frac{1}{\sqrt{2\pi}}\int_{1/\sqrt{N} - r}^{1/\sqrt{N} + r} e^{-t^2/2}\,dt = \gamma.$$

The number $\chi^2_{n,\beta}$ can be obtained from a table of the $\chi^2$ distribution and r can be
determined with the help of a table of the normal distribution.


1. Introduction. The problem of setting tolerance limits for a distribution
on the basis of an observed sample was discussed by S. S. Wilks [1], [2] and by
one of the present authors [3], [4]. For a univariate distribution the problem may
be formulated briefly as follows: Let x be the chance variable under considera-
tion and let $x_1, \cdots, x_N$ be a sample of N independent observations on x. Two
functions, $L_1$ and $L_2$, of the sample are to be constructed such that the probabil-
ity that the limits $L_1$ and $L_2$ will include at least a given proportion $\gamma$ of the popu-
lation is equal to a preassigned value $\beta$. The limits $L_1$ and $L_2$ are called tolerance
limits.

The following two cases have been treated in the literature: (1) Nothing is
known about the distribution of x, except perhaps that it is continuous, or that it
admits a continuous probability density function. (2) The functional form of
the distribution of x is known and only the values of a finite number of parameters
involved in the distribution of x are unknown. We shall refer to (1) as the non-
parametric case and to (2) as the parametric case. An exact solution of the
problem for univariate distributions in the non-parametric case has been given
by S. S. Wilks [1]. His results have been extended to multivariate distributions
by one of the present authors [3]. An asymptotic solution of the problem in the
parametric case, which may be used for large samples, was given in [4]. 2

1 This paper reports work done by the authors in the Statistical Research Group, Divi-
sion of War Research, Columbia University, under contract OEMsr-618 with the Applied
Mathematics Panel, National Defense Research Committee. The work was first reported
in an unpublished memorandum, "Tolerance Limits for a Normal Distribution" (SRG
number 392, 3 January 1945) written by the authors, of whom one was a staff member and
the other a consultant of the Group. The problem was suggested by W. Allen Wallis on
the grounds that the limits previously proposed (see [4], section 5) are unsatisfactory for
most practical purposes.

In the present paper we shall deal with the problem of setting tolerance limits
for a normal distribution with unknown mean and variance. Approximation
formulas are obtained which differ from the exact values by a magnitude of the
order $1/N^2$. They give much closer approximations to the exact values than
those which can be obtained by applying the general asymptotic results in [4]
to the normal distribution. In addition, the approximation formulas in the
present paper have the advantage of considerable simplicity and can easily be
computed with the help of tables of the normal and $\chi^2$ distributions. To estimate
the closeness of the approximation of the formulas given in this paper, a method
of computing upper and lower limits for the exact values has been derived. Com-
putations show that the approximation is good even for small values of N. A few
numerical examples are given in section 7.

2. Precise formulation of the problem and notation. Let $x_1, \cdots, x_N$ be N
independent observations from a normal population with mean $\mu$ and variance
$\sigma^2$, both unknown. We shall denote by $\bar{x}$ the arithmetic mean of the observa-
tions and by $s^2$ the sample estimate of the population variance $\sigma^2$, i.e.,

(2.1) $\bar{x} = \frac{\sum x_i}{N}$

and

(2.2) $s^2 = \frac{\sum (x_i - \bar{x})^2}{n}$, where n = N - 1.

For any positive $\lambda$ we shall denote by $A(\bar{x}, s, \lambda)$, or more briefly by A, the propor-
tion of the normal universe included between the limits $\bar{x} - \lambda s$ and $\bar{x} + \lambda s$, i.e.,

(2.3) $A = A(\bar{x}, s, \lambda) = \frac{1}{\sigma\sqrt{2\pi}}\int_{\bar{x} - \lambda s}^{\bar{x} + \lambda s} e^{-(t - \mu)^2/(2\sigma^2)}\,dt.$

A is a chance variable, since the limits of integration are chance variables. In
this paper we shall deal with the problem of determining the value of $\lambda$ so that
the probability that A exceeds a preassigned value $\gamma$ is equal to a preassigned
value $\beta$. The desired tolerance limits will then be given by $\bar{x} - \lambda s$ and $\bar{x} + \lambda s$,
respectively. In practice, the values $\beta$ and $\gamma$ will usually be chosen near unity,
frequently $\geq .95$.

2 Although the results obtained in the non-parametric case could be applied to the
parametric case as well, it would not be satisfactory to do so, since for the parametric case
methods having greater efficiency can be devised by taking into account the available in-
formation regarding the functional form of the distribution.
It can be verified that the distribution of A does not depend on the unknown
parameters $\mu$ and $\sigma$. Thus we can assume without loss of generality that $\mu = 0$
and $\sigma = 1$.

For any given positive value $\lambda$ we shall denote by $P(\gamma,\lambda)$ the probability that
$A \geq \gamma$. For a given value $\bar{x}$ we shall denote by $P(\gamma,\lambda \mid \bar{x})$ the conditional prob-
ability that $A \geq \gamma$ under the condition that the sample mean has a given value
$\bar{x}$. It is clear that $P(\gamma,\lambda)$ is equal to the expected value of $P(\gamma,\lambda \mid \bar{x})$, i.e.,

(2.4) $P(\gamma,\lambda) = \frac{\sqrt{N}}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} P(\gamma,\lambda \mid \bar{x})\,e^{-N\bar{x}^2/2}\,d\bar{x}.$

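3. Method of computing $P(\gamma,\lambda \mid \bar{x})$ for any given values $\gamma$, $\lambda$ and $\bar{x}$. Since A
= $A(\bar{x},s,\lambda)$ is a strictly increasing function of s, the equation in s

(3.1) $A(\bar{x},s,\lambda) = \gamma$

has exactly one root in s. Denote this root by

(3.2) $s = r(\bar{x},\gamma,\lambda).$

Thus, $r(\bar{x},\gamma,\lambda)$ is that value for which

(3.3) $\frac{1}{\sqrt{2\pi}}\int_{\bar{x} - \lambda r(\bar{x},\gamma,\lambda)}^{\bar{x} + \lambda r(\bar{x},\gamma,\lambda)} e^{-t^2/2}\,dt = \gamma.$

It is clear that $\lambda r(\bar{x},\gamma,\lambda)$ does not depend on $\lambda$. We shall write

(3.4) $\lambda r(\bar{x},\gamma,\lambda) = r(\bar{x},\gamma).$

Obviously $r(\bar{x},\gamma)$ is that value for which

(3.5) $\frac{1}{\sqrt{2\pi}}\int_{\bar{x} - r(\bar{x},\gamma)}^{\bar{x} + r(\bar{x},\gamma)} e^{-t^2/2}\,dt = \gamma.$

For given values of $\bar{x}$ and $\gamma$ the value $r(\bar{x},\gamma)$ can be obtained from a table of the
normal distribution.

Since $A(\bar{x},s,\lambda)$ is a strictly increasing function of s, the inequality $A(\bar{x},s,\lambda) \geq
\gamma$ is equivalent to the inequality $s \geq r(\bar{x},\gamma,\lambda) = r(\bar{x},\gamma)/\lambda$. Hence, since $\bar{x}$ and s
are independently distributed, we have

(3.6) $P(\gamma,\lambda \mid \bar{x}) = P\left(s \geq r(\bar{x},\gamma)/\lambda\right)$

where $P(s \geq c)$ denotes the probability that $s \geq c$ for any constant c. In gen-
eral, for any relation R we shall denote by P(R) the probability that R holds.
Since $ns^2$ has the $\chi^2$ distribution with n = N - 1 degrees of freedom, we have

(3.7) $P\left(s \geq \frac{r(\bar{x},\gamma)}{\lambda}\right) = P\left(\chi^2_n \geq \frac{n\,r^2(\bar{x},\gamma)}{\lambda^2}\right),$

where $\chi^2_n$ stands for a random variable which has the $\chi^2$ distribution with n
degrees of freedom. The probability on the right-hand side of (3.7) can be ob-
tained from a table of the $\chi^2$ distribution.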
Hence, we see that the computation of $P(\gamma,\lambda \mid \bar{x})$ for given values $\gamma$, $\lambda$ and $\bar{x}$
can be carried out in two simple steps. First we determine the value of $r(\bar{x},\gamma)$
from a table of the normal distribution and then read the value of

$$P\left(\chi^2_n \geq \frac{n\,r^2(\bar{x},\gamma)}{\lambda^2}\right)$$

from a table of the $\chi^2$ distribution.
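
The two steps can be sketched in code, with numerical root finding standing in for the tables. This is a minimal illustration using only the Python standard library: the function names are hypothetical, the normal integral is evaluated with `math.erf`, and the $\chi^2$ distribution function is computed from the power series of the incomplete gamma function (an implementation choice made here, not something taken from the paper).

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def r_of(xbar, gamma):
    """Step 1: root r of Phi(xbar + r) - Phi(xbar - r) = gamma, by bisection."""
    lo, hi = 0.0, abs(xbar) + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if norm_cdf(xbar + mid) - norm_cdf(xbar - mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def chi2_cdf(x, n):
    """P(chi-square with n degrees of freedom <= x), via the incomplete gamma series."""
    if x <= 0.0:
        return 0.0
    a, y = 0.5 * n, 0.5 * x
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-17 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return total * math.exp(-y + a * math.log(y) - math.lgamma(a))

def p_conditional(gamma, lam, xbar, N):
    """Step 2: P(gamma, lambda | xbar) = P(chi-square_n >= n r^2 / lambda^2), n = N - 1."""
    n = N - 1
    r = r_of(xbar, gamma)
    return 1.0 - chi2_cdf(n * r * r / (lam * lam), n)
```

For instance, `p_conditional(0.95, 37.674, 1 / math.sqrt(2), 2)` comes out very close to 0.95, and the value falls as `abs(xbar)` increases.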

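4. Proof that the difference $P(\gamma,\lambda \mid 1/\sqrt{N}) - P(\gamma,\lambda)$ is of the order $1/N^2$. It
is clear that $P(\gamma,\lambda \mid \bar{x})$ is an even function of $\bar{x}$. Hence, in the expansion of
$P(\gamma,\lambda \mid \bar{x})$ in a power series in $\bar{x}$, only even powers will occur. Terminating
the Taylor expansion (in section 8 we prove its validity) at the fourth term,
we have

(4.1) $P(\gamma,\lambda \mid \bar{x}) = P(\gamma,\lambda \mid 0) + \frac{\bar{x}^2}{2!}\left.\frac{\partial^2 P(\gamma,\lambda \mid \bar{x})}{\partial \bar{x}^2}\right|_{\bar{x}=0} + \frac{\bar{x}^4}{4!}\left.\frac{\partial^4 P(\gamma,\lambda \mid \bar{x})}{\partial \bar{x}^4}\right|_{\bar{x}=\xi}$

where $0 \leq \xi \leq \bar{x}$.

The expected value of $P(\gamma,\lambda \mid \bar{x})$ (considering $\bar{x}$ as a random variable) is
equal to $P(\gamma,\lambda)$. Since the expected value of $\bar{x}^2$ is 1/N and the expected value
of

$$\frac{\bar{x}^4}{4!}\left.\frac{\partial^4 P}{\partial \bar{x}^4}\right|_{\bar{x}=\xi}$$

is of the order $1/N^2$ (this is proved in section 9), we obtain from (4.1)

(4.2) $P(\gamma,\lambda) = P(\gamma,\lambda \mid 0) + \frac{1}{2N}\left.\frac{\partial^2 P(\gamma,\lambda \mid \bar{x})}{\partial \bar{x}^2}\right|_{\bar{x}=0} + O\left(\frac{1}{N^2}\right).$

On the other hand, substituting $1/\sqrt{N}$ for $\bar{x}$ in (4.1) we obtain

(4.3) $P\left(\gamma,\lambda \,\Big|\, \frac{1}{\sqrt{N}}\right) = P(\gamma,\lambda \mid 0) + \frac{1}{2N}\left.\frac{\partial^2 P}{\partial \bar{x}^2}\right|_{\bar{x}=0} + \frac{1}{4!\,N^2}\left.\frac{\partial^4 P}{\partial \bar{x}^4}\right|_{\bar{x}=\xi'}$

where $0 \leq \xi' \leq 1/\sqrt{N}$. Hence, since the last term of the right member
of (4.3) is of the order $1/N^2$,

(4.4) $P\left(\gamma,\lambda \,\Big|\, \frac{1}{\sqrt{N}}\right) = P(\gamma,\lambda \mid 0) + \frac{1}{2N}\left.\frac{\partial^2 P}{\partial \bar{x}^2}\right|_{\bar{x}=0} + O\left(\frac{1}{N^2}\right).$

From (4.2) and (4.4) it follows that

(4.5) $P(\gamma,\lambda) - P\left(\gamma,\lambda \,\Big|\, \frac{1}{\sqrt{N}}\right) = O\left(\frac{1}{N^2}\right).$

Thus, this difference approaches zero rapidly as $N \to \infty$.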


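5. Computation of the value $\lambda$ for which $P(\gamma,\lambda \mid 1/\sqrt{N})$ takes a preassigned
value $\beta$. Denote by $\chi^2_{n,\beta}$ that value for which $P(\chi^2_n \geq \chi^2_{n,\beta}) = \beta$. This value can
be obtained from a table of the $\chi^2$ distribution. From (3.6) and (3.7) it follows
that the required value $\lambda^*$ of $\lambda$ is given by the root of the equation

(5.1) $\frac{n\,r^2\left(\frac{1}{\sqrt{N}},\gamma\right)}{\lambda^2} = \chi^2_{n,\beta}.$

Thus, the desired value of $\lambda^*$ is given by

(5.2) $\lambda^* = \sqrt{\frac{n}{\chi^2_{n,\beta}}}\; r\left(\frac{1}{\sqrt{N}},\gamma\right).$

The value $r\left(\frac{1}{\sqrt{N}},\gamma\right)$ is defined by (3.5) and can be obtained from a table of the
normal distribution.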
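
The computation of $\lambda^*$ can be sketched as follows, with bisection replacing the two table look-ups. Everything below is an illustrative, standard-library-only sketch: the function names are hypothetical and the series used for the $\chi^2$ distribution function is an implementation choice made here.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def r_of(xbar, gamma):
    """Root r of Phi(xbar + r) - Phi(xbar - r) = gamma, by bisection."""
    lo, hi = 0.0, abs(xbar) + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if norm_cdf(xbar + mid) - norm_cdf(xbar - mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def chi2_cdf(x, n):
    """P(chi-square with n degrees of freedom <= x), via the incomplete gamma series."""
    if x <= 0.0:
        return 0.0
    a, y = 0.5 * n, 0.5 * x
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-17 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return total * math.exp(-y + a * math.log(y) - math.lgamma(a))

def chi2_upper_point(beta, n):
    """Value c with P(chi-square_n >= c) = beta, by bisection on the distribution function."""
    lo, hi = 0.0, n + 20.0 * math.sqrt(n) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, n) < 1.0 - beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lambda_star(beta, gamma, N):
    """Approximate tolerance factor: sqrt(n / chi2_{n,beta}) * r(1/sqrt(N), gamma)."""
    n = N - 1
    r = r_of(1.0 / math.sqrt(N), gamma)
    return math.sqrt(n / chi2_upper_point(beta, n)) * r
```

With N = 2 and $\beta = \gamma = .95$ this sketch gives a factor of about 37.67.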


6. Lower and upper limits for $P(\gamma,\lambda)$. As mentioned in section 2, $P(\gamma,\lambda)$ is
equal to the expected value of $P(\gamma,\lambda \mid \bar{x})$. Thus,

(6.1) $P(\gamma,\lambda) = \frac{\sqrt{N}}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} P(\gamma,\lambda \mid \bar{x})\,e^{-N\bar{x}^2/2}\,d\bar{x}.$

To obtain upper and lower limits for $P(\gamma,\lambda)$, we shall construct upper and lower
limits for the integral on the right-hand side of (6.1). It can easily be seen
that $P(\gamma,\lambda \mid \bar{x})$ is a strictly decreasing function of $\bar{x}^2$. Hence, to obtain lower
and upper limits for the integral in the right member of (6.1) we can proceed
as follows: Choose a positive constant d and a positive integer k. Denote by
$a_i$ the probability that $id \leq \bar{x} \leq (i+1)d$, (i = 0, 1, ..., k-1), and let $a_k$ be the
probability that $\bar{x} \geq kd$. Then $2\sum_{i=0}^{k} a_i P(\gamma,\lambda \mid id)$ is an upper bound, and
$2\sum_{i=1}^{k} a_{i-1} P(\gamma,\lambda \mid id)$ is a lower bound of the integral in question. Thus

(6.2) $P(\gamma,\lambda) \geq 2\sum_{i=1}^{k} a_{i-1} P(\gamma,\lambda \mid id)$

and

(6.3) $P(\gamma,\lambda) \leq 2\sum_{i=0}^{k} a_i P(\gamma,\lambda \mid id).$

The two limits can be brought arbitrarily close to each other by choosing d
sufficiently small and k sufficiently large. A method of computing $P(\gamma,\lambda \mid \bar{x})$
for any given value $\bar{x}$ has been described in section 3 and the quantities $a_i$ can
be obtained from a table of the normal distribution. The amount of compu-
tational work, however, increases rapidly with increasing k.
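
The bounding sums are easy to evaluate by machine. Below is a self-contained sketch using only the standard library; the function names, the default step d = 0.02 and the default k = 300 are illustrative assumptions, not values taken from the paper. The sample mean is treated as normal with mean 0 and variance 1/N, as in section 2.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def r_of(xbar, gamma):
    """Root r of Phi(xbar + r) - Phi(xbar - r) = gamma, by bisection."""
    lo, hi = 0.0, abs(xbar) + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if norm_cdf(xbar + mid) - norm_cdf(xbar - mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def chi2_cdf(x, n):
    """P(chi-square with n degrees of freedom <= x), via the incomplete gamma series."""
    if x <= 0.0:
        return 0.0
    a, y = 0.5 * n, 0.5 * x
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-17 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return total * math.exp(-y + a * math.log(y) - math.lgamma(a))

def p_conditional(gamma, lam, xbar, N):
    """P(gamma, lambda | xbar) = P(chi-square_n >= n r^2 / lambda^2), n = N - 1."""
    n = N - 1
    r = r_of(xbar, gamma)
    return 1.0 - chi2_cdf(n * r * r / (lam * lam), n)

def bounds(gamma, lam, N, d=0.02, k=300):
    """Lower and upper limits for P(gamma, lambda) from the two bounding sums."""
    root_n = math.sqrt(N)
    # a[i] = P(i d <= xbar <= (i+1) d) for i < k, and a[k] = P(xbar >= k d).
    a = [norm_cdf((i + 1) * d * root_n) - norm_cdf(i * d * root_n) for i in range(k)]
    a.append(1.0 - norm_cdf(k * d * root_n))
    p = [p_conditional(gamma, lam, i * d, N) for i in range(k + 1)]
    upper = 2.0 * sum(a[i] * p[i] for i in range(k + 1))
    lower = 2.0 * sum(a[i - 1] * p[i] for i in range(1, k + 1))
    return lower, upper

lower, upper = bounds(0.95, 37.674, 2)
```

Shrinking d (and enlarging k to keep the same range) narrows the gap between the two limits at the cost of more evaluations of `p_conditional`.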

3 The Statistical Research Group computed, under the supervision of Albert H. Bowker,
a table of tolerance limit factors $\lambda$ (see formula 5.2) for $\beta$ = .75, .90, .95, .99; $\gamma$ = .75, .90,
.95, .99, .999; N = 2 (1) 102 (2) 180 (5) 300 (10) 400 (25) 750 (50) 1000. Mr. Bowker also
developed an asymptotic formula for $\lambda$ (published elsewhere in this issue of the Annals)
which, when $\beta \leq .99$, $\gamma \leq .999$, and $N \geq 160$, agrees with (5.2) to within 1 unit in the third
significant figure. The Applied Mathematics Panel plans to publish the table and a brief
explanation of tolerance limits in the volume entitled Techniques of Statistical Analysis de-
scribed in the footnote on page 217.
7. Approximate determination of the tolerance limits. The exact tolerance
limits are given by $\bar{x} - \lambda s$ and $\bar{x} + \lambda s$ where $\lambda$ is the root of the equation in $\lambda$

(7.1) $P(\gamma,\lambda) = \beta.$

This equation has exactly one root in $\lambda$, since $P(\gamma,\lambda)$ is a strictly increasing
function of $\lambda$. Denote this root by $\lambda = \lambda(\beta,\gamma)$. Thus, the exact tolerance
limits are given by $\bar{x} - \lambda(\beta,\gamma)s$ and $\bar{x} + \lambda(\beta,\gamma)s$.

We have seen in section 4 that $P(\gamma,\lambda \mid 1/\sqrt{N})$ closely approximates $P(\gamma,\lambda)$, the
difference being of the order $1/N^2$. Thus, a close approximation to $\lambda(\beta,\gamma)$ can
be obtained by solving the equation in $\lambda$,

(7.2) $P\left(\gamma,\lambda \,\Big|\, \frac{1}{\sqrt{N}}\right) = \beta.$

This equation has again exactly one root in $\lambda$, since $P(\gamma,\lambda \mid 1/\sqrt{N})$ is a strictly
increasing function of $\lambda$. Denote the root of equation (7.2) by $\lambda = \lambda^*(\beta,\gamma)$.
Thus approximate tolerance limits are given by $\bar{x} - \lambda^*(\beta,\gamma)s$ and $\bar{x} + \lambda^*(\beta,\gamma)s$.
In section 5 it has been shown that

(7.3) $\lambda^*(\beta,\gamma) = \sqrt{\frac{n}{\chi^2_{n,\beta}}}\; r$

where n = N - 1, $\chi^2_{n,\beta}$ is that number for which the probability that $\chi^2$ with n
degrees of freedom exceeds this number is $\beta$, and r is the root of the equation

(7.4) $\frac{1}{\sqrt{2\pi}}\int_{1/\sqrt{N} - r}^{1/\sqrt{N} + r} e^{-t^2/2}\,dt = \gamma.$

The number $\chi^2_{n,\beta}$ can be obtained from a table of the $\chi^2$ distribution and r can be
determined from a table of the normal distribution.

Since $\lambda^*(\beta,\gamma)$ is only an approximation to $\lambda(\beta,\gamma)$, $P[\gamma,\lambda^*(\beta,\gamma)]$ will differ slightly
from $\beta$. To judge the goodness of the approximation of $\lambda^*(\beta,\gamma)$ to the exact
value $\lambda(\beta,\gamma)$, it is desirable to derive upper and lower limits for the difference
$P[\gamma,\lambda^*(\beta,\gamma)] - \beta$. Such limits can be obtained by computing upper and lower
limits for $P[\gamma,\lambda^*(\beta,\gamma)]$ using the method described in section 6.

We cite here a few numerical examples to show the goodness of the approxima- 
tion. 



$\lambda^*(\beta,\gamma)$     Upper limit of             Lower limit of
                $P[\gamma,\lambda^*(\beta,\gamma)]$        $P[\gamma,\lambda^*(\beta,\gamma)]$
37.674          .95202                     .95077
4.550           .98989                     .98908
2.631           .95161                     .94393
2.972           .99024                     .98813
8. Validity of the Taylor expansion of $P(\gamma, \lambda \mid \bar{x})$. We shall show that
$P(\gamma, \lambda \mid \bar{x})$ has derivatives of all orders at every point $\bar{x}$, $\gamma$ and $\lambda$ being
fixed. This is sufficient to validate the Taylor expansion used in section 4.

For typographical convenience write

$r(\bar{x}, \gamma) = R.$

We have

(8.1)    $\dfrac{1}{\sqrt{2\pi}} \int_{\bar{x} - R}^{\bar{x} + R} e^{-t^2/2}\, dt = \gamma.$

Differentiating (8.1) with respect to $\bar{x}$ we obtain

(8.2)    $e^{-(\bar{x} + R)^2/2} \left(1 + \dfrac{dR}{d\bar{x}}\right) - e^{-(\bar{x} - R)^2/2} \left(1 - \dfrac{dR}{d\bar{x}}\right) = 0,$

whence

(8.3)    $\dfrac{dR}{d\bar{x}} = \tanh \bar{x}R.$

Now the analytic function $\tanh z$ of the complex variable $z$ has only purely
imaginary singularities. Hence $R$ possesses derivatives of all orders for all
real values of $\bar{x}$.
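Relation (8.3) can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, $R$ is obtained from (8.1) by bisection, and its derivative is estimated by a central difference and compared with $\tanh \bar{x}R$.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def solve_R(x, gamma):
    """R > 0 such that the normal mass between x - R and x + R equals gamma, as in (8.1)."""
    lo, hi = 0.0, 40.0 + abs(x)
    for _ in range(200):  # bisection; the mass increases with R
        mid = 0.5 * (lo + hi)
        if normal_cdf(x + mid) - normal_cdf(x - mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def derivative_check(x, gamma, h=1e-4):
    """Return (central-difference dR/dx, tanh(x R)) for comparison with (8.3)."""
    numeric = (solve_R(x + h, gamma) - solve_R(x - h, gamma)) / (2.0 * h)
    analytic = math.tanh(x * solve_R(x, gamma))
    return numeric, analytic


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, derivative_check(x, 0.95))
```

The two numbers in each pair should agree to within the accuracy of the finite difference; at $\bar{x} = 0$ both vanish, in line with $R$ being a minimum there.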

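Now

$P(\gamma, \lambda \mid \bar{x}) = P\left(\chi^2_n > \dfrac{nR^2}{\lambda^2}\right) = 1 - k \int_0^{\sqrt{n}R/\lambda} t^{n-1} e^{-t^2/2}\, dt,$

where $k$ is a constant. Hence from (8.3)

(8.4)    $\dfrac{\partial P}{\partial \bar{x}} = -h R^{n-1} e^{-nR^2/(2\lambda^2)} \tanh \bar{x}R,$

where $h$ is likewise a constant.

The right member of (8.4) is a product of functions which are analytic in the
entire (complex) $R$ plane by a function which possesses derivatives of all orders
for every real $\bar{x}$. Since $R$ possesses a derivative (with respect to $\bar{x}$) for all real
$\bar{x}$, it follows that $P$ possesses derivatives of all orders for every real $\bar{x}$.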


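9. Proof that $E\left[\dfrac{\bar{x}^4}{4!} \dfrac{\partial^4 P}{\partial \bar{x}^4}\Big|_{\bar{x} = \xi}\right] = O(1/N^2)$.
Since $R$ is a minimum at $\bar{x} = 0$ it follows that $P(\gamma, \lambda \mid \bar{x})$ has a maximum there.
Hence, from (4.1), the quantity

$\dfrac{\bar{x}^2}{2} \dfrac{\partial^2 P}{\partial \bar{x}^2}\Big|_{\bar{x} = 0} + \dfrac{\bar{x}^4}{4!} \dfrac{\partial^4 P}{\partial \bar{x}^4}\Big|_{\bar{x} = \xi}$

is never positive. Therefore

$\dfrac{\partial^4 P}{\partial \bar{x}^4}\Big|_{\bar{x} = \xi} \le -\dfrac{12}{\bar{x}^2} \dfrac{\partial^2 P}{\partial \bar{x}^2}\Big|_{\bar{x} = 0}.$

Consequently $\partial^4 P / \partial \bar{x}^4$ is bounded above for $|\bar{x}| > \delta$, where $\delta > 0$ is
arbitrarily small. Since $P$ possesses everywhere derivatives of all orders, the
fourth derivative is continuous and hence bounded above for $|\bar{x}| \le \delta$. From this
we obtain that $\partial^4 P / \partial \bar{x}^4$ is bounded above for every real $\bar{x}$.

Since $P(\gamma, \lambda \mid \bar{x})$ is always positive we have, from (4.1), that

$\dfrac{\partial^4 P}{\partial \bar{x}^4}\Big|_{\bar{x} = \xi} > -\dfrac{12}{\bar{x}^4} \left[2P(\gamma, \lambda \mid 0) + \bar{x}^2 \dfrac{\partial^2 P}{\partial \bar{x}^2}\Big|_{\bar{x} = 0}\right].$

For $|\bar{x}|$ greater than a sufficiently large number $C$, the left member of the
above inequality is thus bounded below. For $|\bar{x}| \le C$ we have that $\partial^4 P / \partial \bar{x}^4$
is bounded below because $\partial^4 P / \partial \bar{x}^4$ is continuous. Hence $\partial^4 P / \partial \bar{x}^4$ is bounded
below for every real $\bar{x}$.

Since $\partial^4 P / \partial \bar{x}^4$ is bounded above and below for every real $\bar{x}$, the desired
result follows.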
APPROXIMATE FORMULAS FOR THE PERCENTAGE POINTS
AND NORMALIZATION OF t AND $\chi^2$ ¹

By Henry Goldberg² and Harriet Levine

Statistical Research Group, Columbia University

1. Introduction. The $\chi^2$ distribution and Student's t-distribution are functions
of a parameter n (degrees of freedom) and approach the normal distribution
as n approaches infinity. The normal distribution is a good approximation
to these distributions for large n. For small or moderate n, a better
approximation may be obtained by using a function of t (or $\chi^2$) which approaches
the normal distribution more rapidly as n increases. Hotelling and Frankel
[7] pointed out that an additional advantage of the normalization of a distribution
is that further statistical tests are possible with the normalized variate.
Normalizing t (or $\chi^2$) is equivalent to transforming it into a function which is
normally distributed to a required degree of approximation; that is, a normally
distributed variate of zero mean and unit variance is expressed as a function of
t (or $\chi^2$) in powers of 1/n.

The reverse problem of expressing t (or $\chi^2$) as a function of a normally
distributed variate of zero mean and unit variance in powers of 1/n is also of
practical importance in connection with significance tests for which the significance
levels, or percentage points, of the t and $\chi^2$ distributions are required.

Cornish and Fisher [1] (see also [2]) have given a method for the normalization
of distributions which approach normality as the number of degrees of freedom,
n, increases and whose cumulants are expressed in power series of 1/n, so that
the order of magnitude of the rth cumulant is that of … . A method has
also been given for expressing a variate with such a distribution as a function
of a normally distributed variate of zero mean and unit variance in powers of
1/n.

It is the purpose of this note to apply the Cornish-Fisher method (1) to the
derivation of asymptotic formulas for the percentage points of the t and $\chi^2$
distributions and (2) to the normalization of these distributions. Tables are
given which indicate the accuracy of these approximations and compare them
with other approximations. Tables are also given to facilitate the calculation
of the approximations for the percentage points of t and $\chi^2$.


¹ This paper reports work done in the Statistical Research Group, Division of War
Research, Columbia University under contract OEMsr-618 with the Applied Mathematics
Panel, National Defense Research Committee, Office of Scientific Research and
Development. The work was first reported in an unpublished memorandum, "Application of the
Cornish-Fisher method to an approximation of the significance levels of t and $\chi^2$" (SRG
number 507, April 28, 1945).

² Henry Goldberg died April 19, 1945.
2. The Cornish-Fisher method.³ Consider the random variable y with
probability distribution function $f(y)$, expected value $E(y)$, and variance $\sigma^2(y)$.
Let $\kappa_r$ denote the rth cumulant of y and $\alpha_r$ denote the rth relative cumulant of
y, i.e., $\alpha_r = \kappa_r / \kappa_2^{r/2}$. Let x denote a normally distributed variate with zero mean
and unit variance.

For every p, (0 < p < 1), let $y_p$ be defined by

$\int_{-\infty}^{y_p} f(y)\, dy = p$

and $x_p$ by

$\dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{x_p} e^{-\tau^2/2}\, d\tau = p.$

That is, corresponding to every $y_p$, there is an $x_p$ having the same probability
integral (p). The Cornish-Fisher method for expressing a normally distributed
variate with zero mean and unit variance as a function of a standardized variate
with the same probability integral gives

(1)    $x_p \sim b_0 + b_1 z_p + b_2 z_p^2 + b_3 z_p^3 + b_4 z_p^4 + b_5 z_p^5 + \cdots$

where $z_p$ is the standardized variate corresponding to $y_p$; i.e.,

$z_p = \dfrac{y_p - E(y)}{\sigma(y)}$

and the $b_i$ are defined in terms of the relative cumulants.

Cornish and Fisher give also the following expansion for a standardized variate
as a function of a normally distributed variate:

(2)    $z_p \sim c_0 + c_1 x_p + c_2 x_p^2 + c_3 x_p^3 + c_4 x_p^4 + c_5 x_p^5 + \cdots$

where the $c_i$ are defined in terms of the relative cumulants.


3. An approximation for the percentage points of Student's t-distribution.
The standardized variate $z = t \sqrt{(n - 2)/n}$ can be expressed as a function of the
normal variate, x, in powers of 1/n by using the Cornish-Fisher equation (2).
Omitting terms of degree greater than two in 1/n gives, after simplification, the
following asymptotic expansion for t:

(3)    $t \sim x + \dfrac{x^3 + x}{4n} + \dfrac{5x^5 + 16x^3 + 3x}{96n^2} + \cdots$

³ Churchill Eisenhart suggested the use of the Cornish-Fisher method for obtaining
percentage points of the chi-square distribution not given in existing tables, a problem which
arose in several connections, including the computation of a table of factors for tolerance
limits for normal distributions according to two formulas devised in the Statistical
Research Group, one by A. Wald and J. Wolfowitz and the other by Albert H. Bowker, both of
which are published elsewhere in this issue of the Annals of Math. Stat. The table will be
included in a volume by the Statistical Research Group, Techniques of Statistical Analysis,
to be published by the McGraw-Hill Book Company in 1946; its preparation, including the
work reported in the present paper, was directed by Albert H. Bowker; the Statistical
Research Group was directed by W. Allen Wallis.
For simplicity, the subscript p which appears in the Cornish-Fisher equation
(2) has been dropped. It should be understood, however, that the x and t used
in expansion (3) have the same probability integral. It is interesting to note
that the first two terms were derived by Peiser [4].

TABLE 1

Table of Polynomials Required for the Approximation for the Percentage Points
of the t-distribution*

 Probability
 Integral (p)        x_p          f_1(x)         f_2(x)

    .999          3.090232       8.150129      19.692529
    .9975         2.807034       6.231221      12.850916
    .995          2.575829       4.916548       8.834762
    .99           2.326348       3.729074       5.719746
    .975          1.959964       2.372271       2.822499
    .95           1.644854       1.523769       1.420203
    .90           1.281552        .846585        .570891
    .75            .674490        .245335        .079490

* This table can be used for determining $x$, $f_1(x)$ and $f_2(x)$ corresponding to
the complements of the selected values of p by using the relations

$x_{1-p} = -x_p$
$f_1(-x) = -f_1(x)$
$f_2(-x) = -f_2(x).$

To facilitate the use of the approximation, tables of the required polynomials
in x have been computed for selected probability integrals. The approximation
can be written

$t \sim x + \dfrac{f_1(x)}{n} + \dfrac{f_2(x)}{n^2}$

where

$f_1(x) = \dfrac{x^3 + x}{4}$

and

$f_2(x) = \dfrac{5x^5 + 16x^3 + 3x}{96}.$
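As a hedged illustration of how the two polynomials are used, the short sketch below evaluates the normal, two-term and three-term approximations for a given normal deviate x and degrees of freedom n. The function names are hypothetical; only the formulas come from the text.

```python
def f1(x):
    """First correction polynomial, (x^3 + x) / 4."""
    return (x ** 3 + x) / 4.0


def f2(x):
    """Second correction polynomial, (5x^5 + 16x^3 + 3x) / 96."""
    return (5.0 * x ** 5 + 16.0 * x ** 3 + 3.0 * x) / 96.0


def t_percentage_point(x_p, n, terms=3):
    """Approximate percentage point of t from the normal deviate x_p.

    terms=1 gives the normal value, terms=2 adds f1/n, terms=3 adds f2/n^2.
    """
    value = x_p
    if terms >= 2:
        value += f1(x_p) / n
    if terms >= 3:
        value += f2(x_p) / n ** 2
    return value


if __name__ == "__main__":
    x = 1.959964  # normal deviate for p = .975
    for n in (1, 2, 10, 20, 40, 60, 120):
        print(n, round(t_percentage_point(x, n, 2), 4), round(t_percentage_point(x, n, 3), 4))
```

For p = .975 the printed columns can be compared with the 2 Term and 3 Term columns of Table 2.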

Table 1 gives values of $x_p$ (or $x$), $f_1(x)$ and $f_2(x)$ for selected values of the
probability integral p. Table 2 gives approximate and exact percentage points of t
for selected values of p and degrees of freedom. The exact values were taken
from Merrington [5]. Table 2 shows the high degree of accuracy of the three
term approximation for n > 10 and the superiority of this approximation over
the two-term approximation derived by Peiser.

TABLE 2

Comparative Table of Approximate and Exact Values of the Percentage Points
of the t-distribution

 Probability    Degrees of     Approximate Percentage Point      Exact Percentage
 Integral (p)    Freedom       Normal     2 Term     3 Term           Point

    .9975           1          2.8070     9.0383    21.8892          127.32
                    2                     5.9226     9.1354           14.089
                   10                     3.4302     3.5587            3.5814
                   20                     3.1186     3.1507            3.1534
                   40                     2.9628     2.9708            2.9712
                   60                     2.9109     2.9145            2.9146
                  120                     2.8590     2.8599            2.8599

    .9950           1          2.5758     7.4924    16.3271           63.657
                    2                     5.0341     7.2428            9.9248
                   10                     3.0675     3.1558            3.1693
                   20                     2.8217     2.8437            2.8453
                   40                     2.6987     2.7043            2.7045
                   60                     2.6578     2.6602            2.6603
                  120                     2.6168     2.6174            2.6174

    .9750           1          1.9600     4.3322     7.1547           12.706
                    2                     3.1461     3.8517            4.3027
                   10                     2.1972     2.2254            2.2281
                   20                     2.0786     2.0856            2.0860
                   40                     2.0193     2.0210            2.0211
                   60                     1.9995     2.0003            2.0003
                  120                     1.9797     1.9799            1.9799

    .9500           1          1.6449     3.1686     4.5888            6.3138
                    2                     2.4067     2.7618            2.9200
                   10                     1.7972     1.8114            1.8125
                   20                     1.7210     1.7246            1.7247
                   40                     1.6829     1.6838            1.6839
                   60                     1.6702     1.6706            1.6707
                  120                     1.6576     1.6577            1.6577

    .7500           1          0.6745      .9198      .9993            1.0000
                    2                      .7972      .8170             .8165
                   10                      .6990      .6998             .6998
                   20                      .6868      .6870             .6870
                   40                      .6806      .6807             .6807
                   60                      .6786      .6786             .6786
                  120                      .6765      .6765             .6765

4. An approximation for the percentage points of the $\chi^2$ distribution. The
standardized variate $z = \dfrac{\chi^2 - n}{\sqrt{2n}}$ can be expressed as a function of the normal
variate, x, in powers of 1/n by using the Cornish-Fisher equation (2). Retaining
terms in $n^{-3/2}$ gives, after simplification, the following asymptotic expansion
for $\chi^2$:

(4)    $\chi^2 \sim n + G_1(x)\, n^{1/2} + G_2(x) + \dfrac{G_3(x)}{n^{1/2}} + \dfrac{G_4(x)}{n} + \dfrac{G_5(x)}{n^{3/2}} + \cdots$

where

$G_1(x) = \sqrt{2}\, x$

$G_2(x) = \tfrac{2}{3}(x^2 - 1)$

$G_3(x) = \dfrac{1}{9\sqrt{2}}(x^3 - 7x)$

$G_4(x) = -\dfrac{1}{405}(6x^4 + 14x^2 - 32)$

$G_5(x) = \dfrac{1}{4860\sqrt{2}}(9x^5 + 256x^3 - 433x).$

TABLE 3

Table of Polynomials Required for the Approximation for the Percentage Points
of the $\chi^2$ distribution*

 Probability
 Integral (p)     G_1(x)       G_2(x)       G_3(x)       G_4(x)       G_5(x)

    .999         4.370248     5.699690      .619006    -1.602112     1.273498
    .9975        3.969745     4.586292      .193953    -1.113149      .875184
    .995         3.642773     3.756598     -.073888     -.802518      .622768
    .99          3.289953     2.941263     -.290266     -.541971      .411597
    .975         2.771808     1.894306     -.486382     -.272398      .194832
    .95          2.326174     1.137029     -.554981     -.122957      .077898
    .90          1.812388      .428250     -.539450     -.017722      .002186
    .75           .953873     -.363376     -.346842      .060220     -.030881

* This table can be used for determining the $G_i(x)$ for values of x corresponding
to the complements of the selected values of p by using the relations

$x_{1-p} = -x_p$
$G_i(-x) = (-1)^i G_i(x)$, for $i = 1, \ldots, 5$.
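The use of expansion (4) can be sketched as follows. This is an illustration under the polynomial definitions given in this section; the function names are hypothetical, and the normal deviate x is assumed to be supplied (for example from Table 1).

```python
import math


def g_polynomials(x):
    """Return the five polynomials G_1(x), ..., G_5(x) of expansion (4) as a tuple."""
    root2 = math.sqrt(2.0)
    g1 = root2 * x
    g2 = 2.0 * (x ** 2 - 1.0) / 3.0
    g3 = (x ** 3 - 7.0 * x) / (9.0 * root2)
    g4 = -(6.0 * x ** 4 + 14.0 * x ** 2 - 32.0) / 405.0
    g5 = (9.0 * x ** 5 + 256.0 * x ** 3 - 433.0 * x) / (4860.0 * root2)
    return g1, g2, g3, g4, g5


def chi2_percentage_point(x, n):
    """Approximate percentage point of chi-square from expansion (4), through n^(-3/2)."""
    g1, g2, g3, g4, g5 = g_polynomials(x)
    s = math.sqrt(n)
    return n + g1 * s + g2 + g3 / s + g4 / n + g5 / (n * s)


if __name__ == "__main__":
    x = 1.959964  # normal deviate for p = .975
    print(g_polynomials(x))
    print(chi2_percentage_point(x, 10))
```

The polynomial values printed for x = 1.959964 can be compared with the p = .975 row of Table 3.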
As before, the subscript p which appears in the Cornish-Fisher equation (2) has
been dropped. The x and $\chi^2$ which are used in expansion (4) have the same
probability integral. The first four terms were derived by Peiser [4].

Table 3 gives values of the $G_i(x)$ for selected values of the probability integral
p. Table 4 compares various approximations with the exact percentage
TABLE 5

Comparative Table of Approximate and Exact Values of the Probability Integral of t

                              Probability Integral of t

            n = 1              n = 2              n = 10             n = 20
   t    Approx.  Exact     Approx.  Exact     Approx.  Exact     Approx.  Exact

  0.1    .5311   .5317      .5351    ...       .5388   .5388      .5393   .5393
  1      .7734    ...       .7917    ...       .8296   .8296      .8354   .8354
  3     1.0000    ...        ...    .9523      .9954   .9933      .9967   .9965
  5     1.0000   .9372       ...    .9811     1.0000   .9997       ...   1.0000
  6     1.0000   .9474       ...    .9867      ...     .9999       ...   1.0000

TABLE 6

Comparative Table of Approximate and Exact Values of the Probability Integral of $\chi^2$

                              Probability Integral of $\chi^2$

            n = 2              n = 10             n = 20             n = 29
  χ²    Approx.  Exact     Approx.  Exact     Approx.  Exact     Approx.  Exact

   1     .3963   .3935      .0010   .0002      .0000    ...        ...    .0000
   5     .9646   .9179      .1098   .1088      .0004    ...        ...    .0000
  10    1.0000   .9933      .5594   .5595      .0323    ...        ...    .0004
  20    1.0000    ...       .9768   .9707      .5420   .5421       ...    .1071

30 

50 

1.0000 



.9991 

.9305 

■ 

ooou 

.9916 

.5860 

.9910 


points of $\chi^2$ for selected values of p and degrees of freedom. The Peiser four
term approximation, the Wilson-Hilferty approximation,

$\chi^2_p \sim n \left[1 - \dfrac{2}{9n} + x_p \sqrt{\dfrac{2}{9n}}\right]^3,$

and the Fisher approximation,

$\chi^2_p \sim \tfrac{1}{2}\left(x_p + \sqrt{2n - 1}\right)^2,$

are given for comparison. The exact values were taken from Thompson [6].
Table 4 shows the high degree of accuracy, and the general superiority of the
Cornish-Fisher approximation, for n > 10. For low probabilities (.005) the
Peiser approximation is often better than the full series; for small n (1, 2), the
Wilson-Hilferty approximation is often better.
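For readers who wish to reproduce comparisons of this kind, the Wilson-Hilferty and Fisher approximations can be sketched in their standard textbook forms. The function names are hypothetical, and the forms used are the commonly quoted ones, assumed here to be those the authors intended.

```python
import math


def wilson_hilferty(x_p, n):
    """Wilson-Hilferty approximation: n * (1 - 2/(9n) + x_p * sqrt(2/(9n)))^3."""
    return n * (1.0 - 2.0 / (9.0 * n) + x_p * math.sqrt(2.0 / (9.0 * n))) ** 3


def fisher(x_p, n):
    """Fisher approximation: (x_p + sqrt(2n - 1))^2 / 2."""
    return 0.5 * (x_p + math.sqrt(2.0 * n - 1.0)) ** 2


if __name__ == "__main__":
    x = 1.959964  # normal deviate for p = .975
    for n in (2, 10, 20, 29):
        print(n, wilson_hilferty(x, n), fisher(x, n))
```

Both functions take the normal deviate $x_p$ and the degrees of freedom n, so their output can be set beside the value from expansion (4) for the same inputs.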

5. Normalization of t and $\chi^2$. The Cornish-Fisher equation (1) applied
to the t-distribution or, alternatively, a formal reversion of the power series
(3) gives the asymptotic expansion

(5)    $x \sim t \left[1 - \dfrac{t^2 + 1}{4n} + \dfrac{13t^4 + 8t^2 + 3}{96n^2} + \cdots \right].$

Expansion (5) agrees with the first three terms of an expansion derived by
Hotelling and Frankel [7].
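A minimal sketch of the normalization of t follows, assuming the three-term form of expansion (5): t is transformed to an approximately normal deviate, and the normal probability integral of that deviate approximates the probability integral of t. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normalize_t(t, n):
    """Three-term normalization of t with n degrees of freedom, following expansion (5)."""
    t2 = t * t
    return t * (1.0 - (t2 + 1.0) / (4.0 * n)
                + (13.0 * t2 * t2 + 8.0 * t2 + 3.0) / (96.0 * n * n))


def approx_probability_integral_of_t(t, n):
    """Normal probability integral of the normalized deviate."""
    return normal_cdf(normalize_t(t, n))


if __name__ == "__main__":
    for n in (1, 2, 10, 20):
        print(n, [round(approx_probability_integral_of_t(t, n), 4) for t in (0.1, 1.0, 3.0)])
```

The printed values can be compared with the approximate entries of Table 5 for the same t and n.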

Applying the Cornish-Fisher equation (1) to the $\chi^2$ distribution gives the
expansion

1 ~ slisoW' i" 68649 " + l128469 *' + 29056 1 

(6) - | [53553 X 4 + 2208x 2 - 386] + 1 [34257* 6 + 792 x 4 + 238 x 2 ] 

- i [25221 x s + 304 X 6 ] + ~ x’° + • • 

n n* \ 

6. Accuracy of the normalizations of t and $\chi^2$. The accuracy of the normalization
(5) of t may be judged from Table 5, which compares the approximate value
of the probability integral with the exact value. The approximate value is the
normal probability integral corresponding to the value of x computed from (5)
for the given values of t and n. The exact values were obtained from Student's
tables [8]. For fixed n, the approximation improves as t decreases from moderate
to small values. The approximation appears to improve as t increases
from moderate values (about 3) to large values because of the more rapid approach
to unity of the probability integral of a normal variate.

The accuracy of the normalization (6) of $\chi^2$ may be judged from Table 6,
which compares the approximate value of the probability integral with the exact
value. The approximate value is the normal probability integral corresponding
to the value of x computed from (6) for the given values of $\chi^2$ and n. The exact
values were obtained from the table of Pearson [9].
THE EFFECT ON A DISTRIBUTION FUNCTION OF SMALL CHANGES
IN THE POPULATION FUNCTION

By Burton H. Camp

Wesleyan University

1. Summary. It is generally assumed in the application of distribution
theory that, if the actual population function is not very different from the one
used in the theory, then the true sampling distribution of a statistic will not be
very different from the one obtained in the theory. But elsewhere in mathematics
we do not assert that a conclusion will be only slightly modified by a small
deviation in the hypothesis. This paper presents some theorems which are
useful in determining the maximum effect on a sampling distribution of certain
kinds of small changes in the population function. In particular, if the population
is denoted by the function $\phi(t)$, if a sample of n independent measurements
$(t_1, \ldots, t_n)$ is taken from this population, if a statistic $x = g(t_1, \ldots, t_n)$ is
formed from the sample, and if $D(x)$ denotes the distribution of this statistic;
then, when $\phi(t)$ is changed by a small proportionate amount to $\phi_1(t)$, $D(x)$ will
be changed to $D_1(x)$, and the relation between $D$ and $D_1$ will be subject to the
inequality:

$\left| \int_a^b (D - D_1)\, dx \right| \le \epsilon \int_a^b D(x)\, dx, \qquad \epsilon = (1 + \delta)^n - 1, \quad \text{and} \quad \left| \dfrac{\phi_1}{\phi} - 1 \right| \le \delta.$


2. It is generally assumed in the application of distribution theory that, if
the actual population function is not very different from the one used in the
theory, then the true sampling distribution of a statistic will be not very different
from the one obtained in the theory. For example, we commonly apply to
practical problems the distribution theory that has been obtained on the hypothesis
that the population is normally distributed even though we know that
our actual populations are only approximately normal in form, and we commonly
assume that our results are approximately correct. But elsewhere in mathematics
we do not assert that a conclusion will be only slightly modified if we only
slightly modify the hypothesis. Our unwillingness to do this in other branches
of mathematics is illustrated in the following example.

Example 1. Let $y = \phi(t)$ have the derivative $y' = \phi'(t)$. Let $\phi(t)$ be
replaced by $\phi_1(t)$, where $\phi_1 - \phi = s(t)\phi(t)$, and $|s(t)| \le \epsilon$, $\epsilon$ being small. We
have thus chosen to make $(\phi_1 - \phi)$ small relative to $\phi$ rather than small absolutely
so that this example may be useful in another connection. The derivative
of $\phi_1$ may of course differ very greatly from $\phi'(t)$, as for example in some of the
approximations made by a few terms of a Fourier series; and it would be a major
error to assume that the two derivatives are approximately equal. How can we
be sure that, in the process of finding a distribution function, we are not making
an error of the same sort?¹

The following theorems partly answer this question. The theorems will first
be stated and proved in great generality. Then we shall return to the functions
in Example 1 as a special case. We shall be concerned with a sample consisting
of a single observation of n measurements $(t_1, \ldots, t_n)$ drawn from the multivariate
universe $\psi(t_1, \ldots, t_n)$, or, more briefly, with the vector T as a sample
from the n-way universe $\psi(T)$. Throughout this paper $\psi$ and $\phi$ shall be functions
which are non-negative and whose integrals over the entire spaces of their
definition are unity. Let the statistics $(x_1, \ldots, x_m)$, or more briefly the vector
X, be constructed from T thus:

(1)    $x_1 = g_1(T), \; \ldots, \; x_m = g_m(T).$

If now p represents any measurable point set in X space and if $dX$ is used for
$(dx_1 \cdots dx_m)$ and $dT$ for $(dt_1 \cdots dt_n)$, a fundamental theorem [1] of distribution
theory asserts that, if q is the point set in T space for which X is in p, then the
distribution $D(X)$ is determined by the equation,

(2)    $\int_p D(X)\, dX = \int_q \psi(T)\, dT,$ if these integrals exist.


Theorem 1. Using the foregoing notation, let $\psi(T)$ be replaced by $\psi_1(T)$ and
let $\psi_1(T) - \psi(T) = \psi(T)S(T)$, where $|S| \le \epsilon$, and as a consequence let $D(X)$ be
replaced by $D_1(X)$; then

(3)    $\left| \int_p D_1(X)\, dX - \int_p D(X)\, dX \right| \le \epsilon \int_p D(X)\, dX \le \epsilon.$

To prove these inequalities we merely need to notice that the point set q
depends on the g's but not on the universe, and that therefore we may use the
same p and q as in (2) in the following equation which determines $D_1$:

(4)    $\int_p D_1(X)\, dX = \int_q \psi_1(T)\, dT.$

Subtracting (2) from (4) we obtain

(5)    $\left| \int_p D_1\, dX - \int_p D\, dX \right| = \left| \int_p (D_1 - D)\, dX \right| = \left| \int_q (\psi_1 - \psi)\, dT \right|$
       $= \left| \int_q \psi S\, dT \right| \le \epsilon \left| \int_q \psi\, dT \right| = \epsilon \left| \int_p D\, dX \right| \le \epsilon,$

since $\psi$ is never negative, and the integral of $D$ is never greater than unity. It
should be noticed that the final inequality of (5) is independent of the g's,
although this is not true of the preceding inequalities, which do depend on the
g's because they involve p and q.

¹ The general question being raised here has been approached heretofore from different
points of view. In particular, other exact population functions besides the normal
have been studied, and in some cases the distribution theory has not been greatly
disturbed as a result. Also, the effects of slight changes in the parameters of a population
function have been studied.

Corollary. In particular² let $\psi = \phi(t_1) \cdots \phi(t_n)$, where $\phi(t)$ defines a
one-way universe function, and $t_1, \ldots, t_n$ are independent samples from it. Let
$x = g(t_1, \ldots, t_n)$. Then, if $\phi(t)$ is replaced by $\phi_1(t)$ and if $\phi_1 - \phi = s(t)\phi(t)$, and if
$|s(t)| \le \delta$, and if $D(x)$ is the distribution of x before the replacement, and $D_1(x)$
is the corresponding distribution after the replacement,

$\left| \int_a^b D_1(x)\, dx - \int_a^b D(x)\, dx \right| \le \epsilon \int_a^b D(x)\, dx \le \epsilon,$

where

$\epsilon = (1 + \delta)^n - 1$, and $-\infty \le a < b \le \infty$.

This corollary follows from the theorem because of the universe,

$\psi(t_1, \ldots, t_n) = \phi(t_1) \cdots \phi(t_n),$

and

$\psi_1(t_1, \ldots, t_n) = \phi(t_1) \cdots \phi(t_n)[1 + s(t_1)] \cdots [1 + s(t_n)],$

so that, in the notation of the theorem,

$\psi_1(T) = \psi(T) + \psi(T)S(T),$

where

$S(T) = [s(t_1) + \cdots + s(t_n)] + [s(t_1)s(t_2) + \cdots + s(t_{n-1})s(t_n)] + \cdots + [s(t_1) \cdots s(t_n)].$

Hence

$|S| \le n\delta + \dfrac{n!}{2!\,(n - 2)!}\, \delta^2 + \cdots + \delta^n = (1 + \delta)^n - 1 = \epsilon.$

The interval (a, b) now replaces the point set p of the theorem.

This theorem and its corollary are powerful in that they may be applied to all
statistics, but they are weak because of the restrictions on $S(T)$ and $s(t)$. It is
to be noted also that the corollary is ineffective when n is large, a difficulty which
seems to the author to be implicit in the sampling process. The restrictions on
$s(t)$ make it impracticable to apply the corollary to the following example since,
as will be observed, if $|t| > c$, $\phi_1 - \phi = -\phi$, and so then $|s| = 1$; and when
$\delta = 1$, $\epsilon = 2^n - 1$.

Example 2. Let $\phi(t) = (2\pi)^{-1/2} e^{-t^2/2}$ in $(-\infty, \infty)$, and let $\phi_1(t) = A(2\pi)^{-1/2} e^{-t^2/2}$
in $(-c, c)$ and let $\phi_1(t) = 0$ if $|t| > c$, where c is not infinite and A is so
chosen that the integral of $\phi_1$ over $(-\infty, \infty)$ is unity.

This type of example is important because, in the attempt to apply the theory
of normal distributions to practical matters, the first discrepancy that appears
is that in the theory the given distribution is infinite in extent while in practice
it is finite. The following theorem generalizes the preceding one so as to permit
it to apply to this example.

² One could as well use $\phi^{(1)}(t_1) \cdots \phi^{(n)}(t_n)$, but we choose the simpler case on
account of its importance.

Theorem 2. Let all of T-space be divisible into two parts, $Q_0$ and $Q_1$, satisfying
the following conditions. In $Q_0$ let $\psi_1(T) - \psi(T) = S(T)\psi(T)$, and let
$|S(T)| \le \epsilon$. In $Q_1$ let $\psi_1(T) = 0$, and let

$\int_{Q_1} \psi(T)\, dT \le \epsilon_1.$

Then

$\left| \int_p D_1\, dX - \int_p D\, dX \right| \le \epsilon \int_p D\, dX + \epsilon_1 \le \epsilon + \epsilon_1.$

It is not required that $Q_0$ or $Q_1$ be the totality of points for which its attendant
conditions are true.


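Proof. As before, if the integrals exist,

$\int_p D\, dX = \int_q \psi\, dT$

and

$\int_p D_1\, dX = \int_q \psi_1\, dT.$

Hence

$\int_p D_1\, dX - \int_p D\, dX = \int_q (\psi_1 - \psi)\, dT = \int_{q_0} (\psi_1 - \psi)\, dT + \int_{q_1} (\psi_1 - \psi)\, dT,$

where $q_0$ is that part of q which is in $Q_0$, and $q_1$ is that part of q which is in $Q_1$.

(6)    $\left| \int_p D_1\, dX - \int_p D\, dX \right| \le \left| \int_{q_0} (\psi_1 - \psi)\, dT \right| + \left| \int_{q_1} (\psi_1 - \psi)\, dT \right|,$

(7)    $\left| \int_{q_0} (\psi_1 - \psi)\, dT \right| = \left| \int_{q_0} S\psi\, dT \right| \le \epsilon \int_{q_0} \psi\, dT \le \epsilon \int_q \psi\, dT = \epsilon \int_p D\, dX,$ because $\psi \ge 0$,

(8)    $\left| \int_{q_1} (\psi_1 - \psi)\, dT \right| = \left| \int_{q_1} \psi\, dT \right| \le \int_{Q_1} \psi\, dT \le \epsilon_1,$

because $\psi_1 = 0$ in $q_1$. The inequalities (7) and (8), when substituted in (6),
prove the theorem.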

Corollary. In particular, let $\phi$, $\phi_1$, and x be defined as in the corollary to Theorem
1, and let $\phi_1(t)$ be so defined that, if $|t| \le c$, $\phi_1(t) - \phi(t) = s(t)\phi(t)$, where as before
$|s(t)| \le \delta$, and $\epsilon = (1 + \delta)^n - 1$; and, if $|t| > c$, let $\phi_1(t) = 0$. Also let

$\int_{Q_1} \phi(t_1) \cdots \phi(t_n)\, dT \le \epsilon_1,$

where $Q_1$ is the set where $|t_i| > c$ for at least one value of i. Then

$\left| \int_a^b D_1(x)\, dx - \int_a^b D(x)\, dx \right| \le \epsilon \int_a^b D(x)\, dx + \epsilon_1 \le \epsilon + \epsilon_1,$

provided these integrals exist.

Proof. This corollary is implied in the theorem if we let $\psi(T) = \phi(t_1) \cdots \phi(t_n)$
and $\psi_1(T) = \phi_1(t_1) \cdots \phi_1(t_n)$, and then let $Q_0$ be the point set in
T-space where $|t_i| \le c$ for all values of i, and $Q_1$ be the point set where $|t_i| > c$
for at least one value of i. As in the corollary to Theorem 1, p becomes the
interval (a, b).

Example 3. Let $\phi$ and $\phi_1$ be as in Example 2, and choose c = 3. Then A =
1/0.9973 = 1.0027, and

$\int_{Q_1} \phi(t_1) \cdots \phi(t_n)\, dT = 1 - (.9973)^n.$

This quantity may be taken as $\epsilon_1$. Also

$|(\phi_1 - \phi)/\phi| = |A - 1| = 0.0027.$

This quantity may be taken as $\delta$. Then $\epsilon = (1.0027)^n - 1$. Hence

$\left| \int_a^b D_1(x)\, dx - \int_a^b D(x)\, dx \right| \le \epsilon \int_a^b D(x)\, dx + \epsilon_1.$

If n is not large, an approximate value for both $\epsilon$ and $\epsilon_1$ is 0.003n. This quantity
is not particularly small unless n is small, but it could not be expected to be
very small since the corollary pertains to all statistics of the form
$x = g(t_1, \ldots, t_n)$.
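The constants of Example 3 can be recomputed for any truncation point c and sample size n. The sketch below is illustrative only; the function name is hypothetical, and the normal mass within (-c, c) is obtained from the error function rather than from a table.

```python
import math


def truncated_normal_bounds(c, n):
    """Return (A, delta, epsilon, epsilon_1) for a unit normal truncated at +/- c.

    A         normalizing constant of the truncated density
    delta     |A - 1|, the relative change inside (-c, c)
    epsilon   (1 + delta)^n - 1
    epsilon_1 1 - mass^n, the probability that at least one |t_i| exceeds c
    """
    mass = math.erf(c / math.sqrt(2.0))  # normal probability of |t| <= c
    A = 1.0 / mass
    delta = A - 1.0
    epsilon = (1.0 + delta) ** n - 1.0
    epsilon_1 = 1.0 - mass ** n
    return A, delta, epsilon, epsilon_1


if __name__ == "__main__":
    for n in (1, 5, 10, 50):
        A, delta, eps, eps1 = truncated_normal_bounds(3.0, n)
        print(n, round(A, 4), round(delta, 4), round(eps, 4), round(eps1, 4), 0.003 * n)
```

Printing 0.003n alongside shows how far the rough approximation quoted in the example can be trusted as n grows.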

Example 4. In one of the author's earlier papers [2] he found the distribution
of the geometric mean, $x = (t_1 \cdots t_n)^{1/n}$, of n observations chosen from the
universe described by the so-called curve of equal facility, whose equation is

$y = \dfrac{1}{tc\sqrt{2\pi}}\, e^{-(1/2c^2)(\log (t/a))^2}.$

The author stated that there was about as good justification for assuming that
the distribution of statures was given by that universe as for assuming that it
was normal. After one more theorem we shall now be able to state that, if one
wishes to cling to the assumption that the distribution of statures is normal, then
the distribution of the geometric mean is close to the distribution found in that
earlier paper. We do need another theorem for this because we should be dealing
with two distributions, $\phi_1(t)$ and $\phi(t)$, which do not obey the requirements of
the corollary of Theorem 1, because they approach zero at different rates as t
becomes infinite, and do not obey the requirements of the corollary of Theorem
2 because neither vanishes throughout the infinite intervals for which $|t| > c$.
But the following theorem and corollary will take care of this and of similar
cases. It will be observed that Theorem 3 includes Theorem 2 as a special case.

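Theorem 3. Using the foregoing notation, let all of T-space be divisible into
two parts $Q_0$ and $Q_1$ satisfying the following conditions. In $Q_0$ let $\psi_1(T) - \psi(T) =
S(T)\psi(T)$, and let $|S(T)| \le \epsilon$. Let $T = Q_0 + Q_1$ and

$\int_{Q_1} \psi_1(T)\, dT + \int_{Q_1} \psi(T)\, dT \le \epsilon_1.$

Then

$\left| \int_p D_1(X)\, dX - \int_p D(X)\, dX \right| \le \epsilon \int_p D(X)\, dX + \epsilon_1 \le \epsilon + \epsilon_1.$

Proof. As before,

$\int_p D_1\, dX - \int_p D\, dX = \int_q (\psi_1 - \psi)\, dT = \int_{q_0} (\psi_1 - \psi)\, dT + \int_{q_1} (\psi_1 - \psi)\, dT,$

$\left| \int_p D_1\, dX - \int_p D\, dX \right| \le \left| \int_{q_0} (\psi_1 - \psi)\, dT \right| + \left| \int_{q_1} (\psi_1 - \psi)\, dT \right| = \mathrm{I} + \mathrm{II},$

$\mathrm{I} \le \epsilon \int_p D\, dX \le \epsilon,$

$\mathrm{II} \le \int_{q_1} \psi_1\, dT + \int_{q_1} \psi\, dT \le \int_{Q_1} \psi_1\, dT + \int_{Q_1} \psi\, dT \le \epsilon_1.$

These inequalities together prove the theorem.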

Corollary. In particular, let $\phi$, $\phi_1$, and x be as in the corollary of Theorem 2,
except that now, instead of requiring $\phi_1(t)$ to vanish when $|t| > c$, we shall let $Q_1$
and $\epsilon_1$ be so chosen that

$\int_{Q_1} \phi_1(t_1) \cdots \phi_1(t_n)\, dT + \int_{Q_1} \phi(t_1) \cdots \phi(t_n)\, dT \le \epsilon_1.$

Then

$\left| \int_a^b D_1(x)\, dx - \int_a^b D(x)\, dx \right| \le \epsilon \int_a^b D(x)\, dx + \epsilon_1 \le \epsilon + \epsilon_1.$

As before stated, the inequalities of this paper apply to all statistics for which
the integrals involved exist. It seems probable that closer inequalities could be
devised by placing appropriate restrictions on the g functions which define
these statistics.


REFERENCES 

[1] Burton H. Camp, "Methods of obtaining probability distributions," Annals of
Math. Stat., Vol. 8 (1937), pp. 90, 91.

[2] Burton H. Camp, "Notes on the distribution of the geometric mean," Annals of
Math. Stat., Vol. 9 (1938), pp. 221-226.



AN EXPERIMENTAL DESIGN FOR SLOPE-RATIO ASSAYS 

By C. I. Bliss 

Connecticut Agricultural Experiment Station and Yale University 

1. Summary. When the response to a drug is a linear function of arithmetic
dosage units, the relative potency of two preparations can be computed as a
slope-ratio assay. Their dosage-response curves are computed by solving three
simultaneous equations to obtain the common intercept $a'$, the slope of the
standard, $b_1$, and the slope of the unknown, $b_2$. The method is applicable to certain
microbiological assays for the vitamins. Usually several unknowns are assayed
at one time with a single standard. Their calculation is simplified when such
assays meet the following requirements: (1) restriction of treatments to the zone
within which the response is related linearly to the dose, (2) equal spacing of
doses on an arithmetic scale beginning with the negative control, (3) an equal
number (k) of doses of standard and of each unknown and (4) r replicates for
each dose of unknown, h' replicates for the negative control and h replicates for
each dose of the standard.

2. Method of Analysis. The design and analysis of assays for measuring drug
potency has been developed largely about the linear relation between response
and the logarithm of the dose of many drugs. An alternative procedure is
available when some measure of the response is related linearly to arithmetic
dosage units. Recently Finney [5] has applied the technique to microbiological
assays of the vitamins. The relationship is also suitable for experiments with
toxic agents on micro-organisms, where the length of exposure to treatment is
the dose. Since potency is measured from the ratio of the slope of the
dosage-response curve for an unknown to that for the standard preparation, Wood [6]
has termed the method a "slope-ratio assay."

The validity of quantitative biological assays depends upon a qualitative
similarity between the standard and the active agent of the unknown. When
the response is related linearly to the log-dose, this is determined by testing the
parallelism of the lines fitted separately to the results for the standard and to
those for the unknown preparation. If the departure from parallelism is within
the sampling error, the combined slope is determined from the data on both
preparations and used in computing potency and its error. The analogous test
in slope-ratio assays is the convergence of the lines relating response to arithmetic
dose at zero content of drug, using drug as a generic term which includes
vitamins, poisons and physical agents. When the curves for the standard and
the unknown are computed separately, their zero intercept should agree within
the experimental error. In assays meeting this requirement, the curves are
computed so that they are forced to intersect at zero dose. The curves

$y_1 = a' + b_1 x_1$

and

$y_2 = a' + b_2 x_2$

are fitted by solving three simultaneous equations to obtain the three statistics,
$a'$, $b_1$ and $b_2$, which are the best estimates of their respective parameters. Finney
[6] has illustrated the technique with data from the microbiological assay of
nicotinic acid and given a suitable test for convergence as well as the error of the
estimated potency.

The calculation described by Finney is flexible but not adapted for routine 
use. With certain restrictions in design, the calculation can be reduced to a 
practicable form for the assay of (m — 1) unknowns against a standard prepara- 
tion. These restrictions are as follows: 

1. Doses both of standard and of unknowns must fall within the range for 
which some function of the response is related linearly to an arithmetic scale of 
dosage units with convergence at zero dose. 

2. Within this range the doses (x) of standard and of all the unknowns must 
be spaced similarly and preferably equally on an arithmetic scale, beginning 
with the negative control (x = 0) . 

3. The doses of each unknown must match those of the standard in respect 
to both number (k) and their expected potencies, so far as the latter can be 
judged in advance. Within an assay group there may be h' replicates of the 
negative control, h replicates of each dose of the standard and r replicates of each 
dose of each unknown. 

4. Some element of randomization must be introduced within an assay group 
in respect to the preparation of the tubes, their handling and the reading of the 
results. Replicates of any given dose or of the negative control must not be 
prepared together. 


3. Computational Procedure. The simplified calculation of potency and its 
error depends upon substituting the assumed for the actual doses. When 
spaced equally on an arithmetic scale, they may be coded by using the numbers 
1, 2, 3, ..., k, k being equal throughout the assay. The sums of the coded 
doses, $S_1$, and of their squares, $S_2$, are then the same for each preparation and 
may be entered in the equations for computing the inverse matrix, of which the 
first three are 

(1)
$$\begin{array}{lccc}
 & i = 0 & i = 1 & i = 2 \\
N c_{0i} + h S_1 c_{1i} + r S_1 c_{2i} + \cdots = & 1, & 0, & 0, \cdots \\
h S_1 c_{0i} + h S_2 c_{1i} = & 0, & 1, & 0, \cdots \\
r S_1 c_{0i} + r S_2 c_{2i} = & 0, & 0, & 1, \cdots
\end{array}$$

where the total number of observations is $N = h' + kh + kr(m - 1)$. Multi- 
plying the last two rows by $-S_1/S_2$ and adding the products, we have 

$$\left\{ N - \frac{h S_1^2}{S_2} - (m - 1)\frac{r S_1^2}{S_2} \right\} c_{00} = 1,$$

where the subscript 1 refers to the standard and the subscripts 2 to m refer to the unknown 
preparations. Substituting 

$D = N S_2 - h S_1^2 - r(m - 1) S_1^2,$

this leads to the following reciprocal coefficients: 

$c_{00} = S_2/D$

$c_{0i} = c_{i0} = -S_1/D$, $i = 1, 2, \ldots, m$

$c_{11} = 1/(h S_2) + S_1^2/(D S_2)$

$c_{ii} = 1/(r S_2) + S_1^2/(D S_2)$, $i = 2, 3, \ldots, m$, and 

$c_{ij} = S_1^2/(D S_2)$ for $i, j = 1, 2, \ldots, m$, where $i \neq j$. 

The reciprocal coefficients are computed from the sums of the doses and their 
squares, which are the same for all preparations. The doses are multiplied by 
the responses observed at each dosage level to obtain $T_i = S(x y_i)$ for any given 
preparation. For the standard there will be h responses at each dose and for 
each unknown r responses. Let $T = S(T_i)$ be the sum of these products over all 
m preparations. The total response for all N observations $S(y)$, including the 
negative control, the standard, and all the unknowns, is designated as $T_y$. 

Using normal regression theory, the common intercept is computed as 

$a' = c_{00} T_y + c_{01} T.$

Substituting the above reciprocal coefficients, 

(2) $a' = (S_2 T_y - S_1 T)/D.$

The slope of the standard is computed with the reciprocal coefficients as 

$b_1 = c_{01} T_y + c_{11} T_1 + c_{1i} T - c_{1i} T_1.$

We may take advantage of the identities 

$c_{01} = -\frac{S_1}{S_2}\, c_{00}$ and $c_{1i} = -\frac{S_1}{S_2}\, c_{0i}$

to obtain 

$b_1 = (c_{11} - c_{1i}) T_1 - \frac{S_1}{S_2}\, a',$

reducing to 

(3) $b_1 = \frac{T_1}{h S_2} - \frac{a' S_1}{S_2}.$

Similarly the slope of each unknown is equal to 

$b_i = c_{0i} T_y + c_{1i} T_1 + c_{ii} T_i + c_{ij}\{T - (T_1 + T_i)\},$

where $i, j = 2, 3, \ldots, m$ and $j \neq i$. Since $c_{1i} - c_{ij} = 0$, this may be reduced to 

(4) $b_i = \frac{T_i}{r S_2} - \frac{a' S_1}{S_2}, \quad i = 2, 3, \ldots, m.$


The computation is further simplified if the k doses of all preparations are 
spaced not only similarly on an arithmetic scale but also at equal intervals. 
In this case 


$S_1 = k(k + 1)/2$ and $S_2 = k(k + 1)(2k + 1)/6.$

Substituting in equations (2), (3) and (4), the common intercept, the slope of 
the standard and that of each unknown may be computed as 

(5) $a' = \frac{2(2k + 1) T_y - 6T}{N(k - 1) + 3h'(k + 1)},$

(6) $b_1 = \frac{1}{2k + 1}\left\{\frac{6 T_1}{h k (k + 1)} - 3a'\right\},$

(7) $b_i = \frac{1}{2k + 1}\left\{\frac{6 T_i}{r k (k + 1)} - 3a'\right\}.$

In computing the slope for each unknown in an assay the only variable is $T_i$. 
The intercept and the slopes can be checked by substitution in the equation 

(8) $2N a' + h k (k + 1) b_1 + r k (k + 1)(b_2 + \cdots + b_m) = 2 T_y.$

In terms of coded doses, the potency of an unknown (i) relative to that of the 
standard (1) is computed as 

(9) $J'_i = b_i / b_1.$

Each $J'$ is converted to original units by multiplying it by the ratio of the dosage 
intervals, $I_s/I_u$, the potency being 

(10) $J_i = \frac{I_s b_i}{I_u b_1}.$


The variance measuring the distribution of the observations about the m 
lines may be determined as 


(11) $s^2 = \frac{S(y^2) - a' T_y - b_1 T_1 - \cdots - b_m T_m}{N - m - 1}.$


The variation about the individual lines is assumed not to vary from one prepa- 
ration to another. This is more likely to be true when the assumed potencies 
differ but little from those computed from the assay, so that J' differs relatively 
little from unity. 
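
The arithmetic of equations (2), (3), (4), (9) and (11) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `slope_ratio_assay`, the nested-list layout of the responses and the returned dictionary are hypothetical conveniences introduced here, and the doses are assumed to be coded 1, 2, ..., k with the same k for every preparation.

```python
def slope_ratio_assay(control, standard, unknowns):
    """Common intercept, slopes, relative potencies and s^2 for coded doses 1..k.

    control  : list of h' responses at zero dose (hypothetical layout)
    standard : list of k lists, each holding the h responses at one coded dose
    unknowns : list of (m - 1) tables shaped like `standard`, r responses per dose
    """
    k = len(standard)
    h = len(standard[0])
    r = len(unknowns[0][0])
    m = 1 + len(unknowns)
    n_obs = len(control) + k * h + k * r * (m - 1)

    # Sums of the coded doses and of their squares.
    s1 = k * (k + 1) / 2
    s2 = k * (k + 1) * (2 * k + 1) / 6

    preparations = [standard] + list(unknowns)
    # T_i = S(x y_i): coded dose times the total response at that dose.
    t_i = [sum(x * sum(reps) for x, reps in enumerate(p, start=1))
           for p in preparations]
    t_total = sum(t_i)
    t_y = sum(control) + sum(sum(reps) for p in preparations for reps in p)

    d = n_obs * s2 - h * s1 ** 2 - r * (m - 1) * s1 ** 2
    a = (s2 * t_y - s1 * t_total) / d                       # equation (2)
    b = [t_i[0] / (h * s2) - a * s1 / s2]                   # equation (3)
    b += [t / (r * s2) - a * s1 / s2 for t in t_i[1:]]      # equation (4)

    sum_y2 = sum(y * y for y in control)
    sum_y2 += sum(y * y for p in preparations for reps in p for y in reps)
    resid = sum_y2 - a * t_y - sum(bi * ti for bi, ti in zip(b, t_i))
    s_sq = resid / (n_obs - m - 1)                          # equation (11)

    potency = [bi / b[0] for bi in b[1:]]                   # equation (9)
    return {"a": a, "b": b, "potency": potency, "s2": s_sq}
```

With responses lying exactly on the lines y = 2 + 3x (standard) and y = 2 + 1.5x (one unknown), the function should return the intercept 2, the slopes 3 and 1.5, and a coded potency of one half, which gives a quick check on any hand computation.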

The confidence limits for potency as estimated from the ratio of the slopes 
may be computed from Fieller’s basic formula [4]. For confidence limits, $X_L$, 
at an appropriate level of significance, such as P = 0.05, t is read from the Stu- 
dent-distribution for N - m - 1 degrees of freedom and entered with $s^2$ from 
equation (11) in the equation 

(12) $X_L^2 (b_1^2 - c_{11} s^2 t^2) - 2 X_L (b_1 b_i - c_{1i} s^2 t^2) + (b_i^2 - c_{ii} s^2 t^2) \le 0,$

where i indicates one of the 2 to m unknown preparations. When the equality is solved for $X_L$, 
the limits may be written 

(13) $X_L = \frac{b_1 b_i - c_{1i} s^2 t^2 \pm s t \sqrt{(c_{11} - c_{1i}) b_i^2 + (c_{ii} - c_{1i}) b_1^2 + c_{1i} (b_1 - b_i)^2 - (c_{11} c_{ii} - c_{1i}^2) s^2 t^2}}{b_1^2 - c_{11} s^2 t^2},$

where $c_{11} - c_{1i} = 1/(h S_2)$, $c_{ii} - c_{1i} = 1/(r S_2)$ and $c_{11} c_{ii} - c_{1i}^2 = \{D + (h + r) S_1^2\}/(h r D S_2^2)$. 

In all critical cases, the exact limits should be computed. 
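
As a hedged illustration of the exact limits, the quadratic in the confidence limit can be solved directly once the reciprocal coefficients are known. The sketch below assumes the coefficients take the forms $c_{11} = 1/(hS_2) + S_1^2/(DS_2)$, $c_{ii} = 1/(rS_2) + S_1^2/(DS_2)$ and $c_{1i} = S_1^2/(DS_2)$ for equally spaced coded doses; the function names are hypothetical, and the value of t must be supplied by the user from a Student table.

```python
import math


def reciprocal_coefficients(k, h, r, m, n_obs):
    """Return (c11, cii, c1i) for coded doses 1..k (assumed forms, see lead-in)."""
    s1 = k * (k + 1) / 2
    s2 = k * (k + 1) * (2 * k + 1) / 6
    d = n_obs * s2 - h * s1 ** 2 - r * (m - 1) * s1 ** 2
    c1i = s1 ** 2 / (d * s2)
    return 1 / (h * s2) + c1i, 1 / (r * s2) + c1i, c1i


def fieller_limits(b1, bi, s_sq, t, c11, cii, c1i):
    """Roots of X^2 (b1^2 - c11 q) - 2X (b1 bi - c1i q) + (bi^2 - cii q) = 0, q = s^2 t^2."""
    q = s_sq * t * t
    a_coef = b1 * b1 - c11 * q
    b_coef = b1 * bi - c1i * q
    c_coef = bi * bi - cii * q
    if a_coef <= 0:
        # The slope of the standard is not significantly different from zero,
        # so finite limits of this form do not exist.
        raise ValueError("slope of the standard is not significant")
    root = math.sqrt(b_coef * b_coef - a_coef * c_coef)
    return (b_coef - root) / a_coef, (b_coef + root) / a_coef


def approximate_variance(b1, bi, s_sq, c11, cii, c1i):
    """Large-sample variance of the ratio bi/b1 (variance-of-a-ratio formula)."""
    bracket = (c11 - c1i) * bi ** 2 + (cii - c1i) * b1 ** 2 + c1i * (b1 - bi) ** 2
    return s_sq * bracket / b1 ** 4
```

Comparing the exact roots with the coded potency plus or minus t times the square root of `approximate_variance` shows how far the two sets of limits diverge when the slope of the standard is only moderately significant.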

In most slope-ratio assays the individual slopes differ very significantly from 
zero. Under these circumstances the approximate limits may be computed 
with reasonable accuracy from the variance of the estimated potency by the 
familiar formula for the variance of a ratio [1]. 

(14) $V(J') = \frac{b_i^2 s^2}{b_1^2}\left(\frac{c_{11}}{b_1^2} + \frac{c_{ii}}{b_i^2} - \frac{2 c_{1i}}{b_1 b_i}\right) = \left(\frac{s}{b_1^2}\right)^2 \{(c_{11} - c_{1i}) b_i^2 + (c_{ii} - c_{1i}) b_1^2 + c_{1i} (b_1 - b_i)^2\}.$

The discrepancies between the approximate and the exact limits are evident 
from a comparison of equations (13) and (14). When the doses are spaced at 
equal arithmetic intervals, equation (14) can be reduced to the more convenient 
form 

(15) $V(J') = \frac{6 s^2}{b_1^2 (2k + 1)}\left(\frac{h + r J'^2}{r h k (k + 1)} + \frac{3 (1 - J')^2}{N(k - 1) + 3h'(k + 1)}\right).$

A major limitation to slope-ratio assays is the frequent curvature in the rela- 
tion between response and arithmetic dosage units. For this reason it is advis- 
able to use routinely four or more doses of each preparation. Occasionally an 
assay in which there is curvature at the highest dosage level may be salvaged by 
computing the potencies from the data of the smaller doses. The agreement of a 
given assay with the postulate upon which it is based may be tested objectively 
by an analysis of variance, segregating the sums of squares (a) for the agreement 
of the negative control with the intercept, (b) for the agreement of the individual 
curves at the intercept, (c) for agreement of the observations with straight lines 
fitted individually and (d) for the variation among the h replicates of the stand- 
ard, the h' replicates of the negative control and the r replicates of the unknowns. 
The calculation of such an analysis is greatly facilitated by the recommended 
design. Since it follows the usual pattern, it will not be described here. The 
procedure has been tested with the data from an experiment on the depth dose 
of x-rays [2] and has been applied to microbiological assays [3] in papers where 
the reader will find the technique exemplified. 
NOTES 

This section is devoted to brief research and expository articles, notes on 
methodology and other short items. 

COMPUTATION OF FACTORS FOR TOLERANCE LIMITS ON A NORMAL 
DISTRIBUTION WHEN THE SAMPLE IS LARGE¹ 

By Albert H. Bowker 
Columbia University 

In their paper [1], Wald and Wolfowitz discuss the problem of finding tolerance 
limits of the form $\bar{x} \pm \lambda s$ for a normal distribution. They propose the following 
large sample formula for $\lambda$ which appears to be satisfactory for all practical 
purposes for $N \ge 2$: 

(1) $\lambda = \sqrt{\frac{n}{\chi^2_{n,\beta}}}\; r,$

where N is the number of observations (n = N - 1), $\gamma$ is the tolerance coeffi- 
cient, $\beta$ is the confidence coefficient, r is defined by 

$\frac{1}{\sqrt{2\pi}} \int_{1/\sqrt{N} - r}^{1/\sqrt{N} + r} e^{-t^2/2}\, dt = \gamma,$

and $\chi^2_{n,\beta}$ has the property that $P(\chi^2_n > \chi^2_{n,\beta}) = \beta$ for n degrees of freedom. To compute 
$\lambda$, tables [2] or known approximations [3] for $\chi^2_{n,\beta}$ are customarily used, but the 
computation of r, even for large N, is tedious, involving an iterative procedure. 
The purpose of this note is to obtain an expansion of r in terms of $1/\sqrt{N}$ and to 
combine this expansion with a known one for $\chi^2_{n,\beta}$ to obtain an asymptotic formula 
for $\lambda$. 

To derive a large sample formula for r, consider the function 

(2) $f(x, y) = \frac{1}{\sqrt{2\pi}} \int_{x - y}^{x + y} e^{-t^2/2}\, dt - \gamma = 0,$

where for convenience $1/\sqrt{N}$ and r are replaced by x and y. It is desired to express 
y as a power series in x. Let $y_0$ be defined by $f(0, y_0) = 0$. Since f(x, y) is a con- 
tinuous function of x and y, and since $\partial f/\partial y \neq 0$ at $x = 0$, $y = y_0$, the function y(x) defined 
implicitly by (2) is continuous. Since $\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = \tanh xy$, the higher deriva- 
tives of y(x) exist and are continuous and y(x) permits of a finite Taylor’s ex- 
pansion. The coefficients of odd powers of x drop out and we obtain 

$y = y_0 + \frac{y_0}{2!}\, x^2 + O(x^4),$

or, returning to the original notation and retaining terms in 1/N, 

(3) $r = r_\infty \left(1 + \frac{1}{2N}\right).$

If $x_p$ is defined by $\frac{1}{\sqrt{2\pi}} \int_{x_p}^{\infty} e^{-t^2/2}\, dt = p$ we know from [3] that 

(4) $\frac{\chi^2_{n,\beta}}{n} = 1 - \frac{\sqrt{2}\, x_{1-\beta}}{\sqrt{n}} + \frac{2 (x_{1-\beta}^2 - 1)}{3n} + \cdots.$

Proceeding formally and retaining terms in 1/N we obtain 

$\sqrt{\frac{n}{\chi^2_{n,\beta}}} = 1 + \frac{x_{1-\beta}}{\sqrt{2N}} + \frac{4 + 5 x_{1-\beta}^2}{12N},$

and multiplying by the expression for r given by equation (3) we find the desired 
expansion for $\lambda$: 

(5) $\lambda = r_\infty \left(1 + \frac{x_{1-\beta}}{\sqrt{2N}} + \frac{5 x_{1-\beta}^2 + 10}{12N}\right).$

Recall that both $r_\infty$ and $x_{1-\beta}$ are readily obtainable from tables of the normal 
curve; in fact, $r_\infty$ is defined by 

$\frac{1}{\sqrt{2\pi}} \int_{-r_\infty}^{r_\infty} e^{-t^2/2}\, dt = \gamma$ and $x_{1-\beta}$ is defined by $\frac{1}{\sqrt{2\pi}} \int_{x_{1-\beta}}^{\infty} e^{-t^2/2}\, dt = 1 - \beta.$

A comparative table of approximate and exact values of $\lambda$ is given in Table 1. 
From the table we see that for $N \ge 800$ the error is less than 1 in the 4th sig- 
nificant figure, and for $N \ge 160$ the error is less than 1 in the 3rd significant 
figure within the limits of $\beta$ and $\gamma$ considered. The approximation will be less 
exact for higher values of $\beta$ and $\gamma$. 

TABLE 1 

Comparative Values of Exact and Approximate $\lambda$ 

                              N = 50                        N = 100                       N = 160
$\beta$   $\gamma$   Exact     Approx.   Diff.     Exact     Approx.   Diff.     Exact     Approx.   Diff.
.75       .75        ...       1.25147   .00333    ...       1.21698   ...       ...       ...       .00053
.75       .95        2.13774   2.13226   .00548    2.07533   ...       ...       ...       ...       .00089
.75       .999       3.58821   3.57979   .00842    ...       3.48112   ...       ...       3.43563   .00141
.95       .75        1.39621   1.38467   ...       ...       1.30670   ...       ...       ...       .00182
.95       .95        2.37866   2.35921   ...       2.23279   2.22635   ...       2.16728   2.16420   .00308
.95       .999       3.99259   ...       .03179    3.74835   3.73776   .01059    3.63850   3.63341   .00509
.99       .75        1.51184   1.48901   .02283    1.38251   1.37511   .00740    ...       1.32215   .00351
.99       .95        2.57565   2.53698   .03867    2.35546   2.34290   .01256    2.25865   2.25268   .00597
.99       .999       4.32326   4.25926   .06390    ...       3.93343   .02086    3.79189   3.78196   .00993

Comparative Values of Exact and Approximate $\lambda$ (Continued) 

                              N = 500                       N = 800                       N = 1000
$\beta$   $\gamma$   Exact     Approx.   Diff.     Exact     Approx.   Diff.     Exact     Approx.   Diff.
.75       .75        1.17733   1.17724   ...       1.17126   1.17122   ...       1.16891   1.16888   ...
.75       .95        ...       ...       ...       1.99559   1.99552   ...       1.99158   1.99153   ...
.75       .999       3.36769   3.36744   ...       ...       ...       ...       3.34361   3.34352   ...
.95       .75        ...       1.21470   ...       ...       1.20047   ...       1.19502   1.19491   .00011
.95       .95        2.07013   ...       ...       ...       2.04536   ...       ...       2.03589   .00019
.95       .999       3.47647   3.47459   ...       3.43433   3.43390   ...       3.41831   ...       .00031
.99       .75        1.24268   1.24208   ...       1.22198   1.22169   ...       1.21395   1.21374   .00021
.99       .95        2.11727   2.11626   ...       ...       2.08152   ...       ...       ...       .00035
.99       .999       3.55462   3.55292   ...       3.49543   3.49460   ...       3.47244   3.47186   .00058

¹ This paper reports work done in the Statistical Research Group, Division of War Re- 
search, Columbia University, under Contract OEMsr-018 with the Applied Mathematics 
Panel, National Defense Research Committee, Office of Scientific Research and Develop- 
ment. The work was first reported in an unpublished memorandum, “Computation of 
Factors for Tolerance Limits when the Sample is Large” (SRG No. 560, September 24, 
1945). A brief account of the application of tolerance limits, including tables, will be 
published in Techniques of Statistical Analysis described in the footnote on page 217. 
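
The large-sample formula lends itself to direct computation. The following Python sketch is offered as an illustration only: it assumes the expansion takes the form $\lambda = r_\infty\{1 + x_{1-\beta}/\sqrt{2N} + (5x_{1-\beta}^2 + 10)/(12N)\}$ with $x_{1-\beta}$ the normal deviate exceeded with probability $1 - \beta$, and the function names are hypothetical. The routine `exact_r` solves the defining integral equation for r by bisection so that the one-term correction $r_\infty(1 + 1/2N)$ can be compared with it.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def r_infinity(gamma):
    """Normal deviate r with central area gamma between -r and +r."""
    return _STD.inv_cdf((1 + gamma) / 2)


def approx_lambda(n_obs, beta, gamma):
    """Large-sample tolerance factor (assumed form of the expansion, see lead-in)."""
    x = _STD.inv_cdf(beta)  # deviate exceeded with probability 1 - beta
    return r_infinity(gamma) * (1 + x / sqrt(2 * n_obs)
                                + (5 * x * x + 10) / (12 * n_obs))


def exact_r(n_obs, gamma, tol=1e-12):
    """Solve Phi(1/sqrt(N) + r) - Phi(1/sqrt(N) - r) = gamma for r by bisection."""
    x = 1 / sqrt(n_obs)
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if _STD.cdf(x + mid) - _STD.cdf(x - mid) < gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, `approx_lambda(50, 0.75, 0.75)` can be set beside the corresponding approximate entry of Table 1, and `exact_r(100, 0.95)` beside `r_infinity(0.95) * (1 + 1 / 200)`.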
THE PROBABILITY DISTRIBUTION OF THE MEASURE 
OF A RANDOM LINEAR SET 

By David F. Votaw, Jr. 

Naval Ordnance Laboratory 

1. Introduction. Consider a random sample $O_n(x_1, \ldots, x_n)$ of n values of a 
one-dimensional random variable x having cumulative distribution function 
F(x). Let there be associated with each x an interval of length D centered at x 
(D a positive constant). Let $\mathcal{S}(O_n)$ denote the random set which is the point-set 
sum of the n intervals associated with $O_n$; $\mathcal{S}(O_n)$ is a set of one or more intervals. 
Let S denote the measure of $\mathcal{S}(O_n)$ (S is the sum of lengths of the intervals 
composing $\mathcal{S}(O_n)$). Given F, n and D, what is the probability function of S? 
This note contains a solution of the problem for $F(x) = x$, $(0 \le x \le 1)$; the case 
of $F(x) = \int_0^x H e^{-Ht}\, dt$, $(0 \le x < \infty;\ H > 0)$, is also treated. 


2. Sampling from a uniform distribution. Let $y = S - D$. The range of 
y is $0 \le y \le m$, where m denotes the minimum of 1 and $(n - 1)D$. Let $x_1$, 
..., $x_n$ be the sample values arranged in increasing order of magnitude. Make 
the transformation 

(2.1) $y_0 = x_1$, $\quad y_i = x_{i+1} - x_i$, $(i = 1, \ldots, n - 1)$. 

y can be expressed as $\sum_{u=1}^{n-1} m(y_u, D)$, where $m(y_u, D)$ denotes the minimum of 
$y_u$ and D. The probability function of $(y_0, y_1, \ldots, y_{n-1})$ is $n! \prod_{u=0}^{n-1} dy_u$, $(y_u \ge 0;\ \sum_{u=0}^{n-1} y_u \le 1)$. 
If $m = (n - 1)D$, then $y = (n - 1)D$ if and only if $y_i \ge D$, $(i = 1, \ldots, n - 1)$; 
for a fixed $y_0$ it can be shown by use of the Dirichlet integral that 
the volume of the (n - 1) dimensional region in which any point $(y_0, y_1, \ldots, y_{n-1})$ 
satisfies this condition is $\frac{[1 - y_0 - (n - 1)D]^{n-1}}{(n - 1)!}$. It follows that: 

(2.2) $\Pr[y = (n - 1)D] = n \int_{y_0 = 0}^{1 - (n-1)D} [1 - y_0 - (n - 1)D]^{n-1}\, dy_0 = [1 - (n - 1)D]^n$, $\quad ((n - 1)D < 1)$. 


The probability that $Y < y < Y + \Delta Y$ (where $Y < m$ and $\Delta Y$ denotes an 
arbitrarily small positive increment in Y) can be evaluated by determining 
volumes of certain regions contained in the tetrahedron defined by $y_u \ge 0$, 
$\sum_{u=0}^{n-1} y_u \le 1$. Consider the following conditions: 

(a) $qD \le Y < (q + 1)D$, $(q = 0, 1, \ldots, M$; M denotes the minimum of $(n - 2)$ and the greatest integer less than $1/D)$, 

(b) $y_u > D$, $(u = 1, \ldots, j;\ j \le q)$, 

(c) $\sum_{v=0}^{j} y_v < 1 - y + jD$, 

(d) $y_i < D$, $(i = j + 1, \ldots, n - 1)$. 

The probability that $Y < y < Y + \Delta Y$ and that (b), (c) and (d) are satisfied is: 

(2.3) $n! \int_{Y}^{Y + \Delta Y} \left[\int_{y_0 = 0}^{1 - y} A(y, y_0)\, dy_0\right] B(y)\, dy,$

where $A(y, y_0)$ denotes the j dimensional volume of the region in which any 
point $(y_1, \ldots, y_j)$ satisfies (b) and (c), and B(y) denotes the (n - j - 2) 
dimensional volume of intersection of the hyperplane $\sum_{v=j+1}^{n-1} y_v = y - jD$ with an 
(n - j - 1) dimensional cube $(0 \le y_v \le D)$. It is clear that if any other of 
the $\binom{n-1}{j}$ combinations of j y’s out of the set of (n - 1) y’s had been specified 
in (b) and the (n - j - 1) complementary y’s had been specified in (d), the 
corresponding $A(y, y_0)$ and B(y) would be equal to those given in (2.3); hence 

(2.4) $\Pr[Y < y < Y + \Delta Y] = n! \sum_{j=0}^{q} \binom{n-1}{j} \int_{Y}^{Y + \Delta Y} B(y) \left[\int_{y_0 = 0}^{1 - y} A(y, y_0)\, dy_0\right] dy,$
$qD \le Y < (q + 1)D$, $Y < m$, $(q = 0, 1, \ldots, M)$, 

where $A(y, y_0) = \frac{(1 - y_0 - y)^j}{j!}$ and (see [1] and [2]) 

(2.5) $B(y) = \frac{1}{(n - j - 2)!} \sum_{r=0}^{q-j} (-1)^r \binom{n-j-1}{r} [y - (j + r)D]^{n-j-2}.$

From (2.4) and (2.5) it follows that the probability function of y, say f(y), is: 

(2.6) $f(y) = n! \sum_{j=0}^{q} \sum_{r=0}^{q-j} \frac{(-1)^r}{(j + 1)!\,(n - j - 2)!} \binom{n-1}{j} \binom{n-j-1}{r} (1 - y)^{j+1} [y - D(j + r)]^{n-j-2},$
$qD \le y < (q + 1)D$, $(q = 0, \ldots, M)$, $y < m$. 

f(y) is not defined at $(n - 1)D$ if $(n - 1)D < 1$ (see (2.2)); if m = 1, the range 
of definition of f(y) as given in (2.6) is $y < 1$. 

The cumulative distribution function of y is continuous with the exception, 
in the case of $(n - 1)D < 1$, of a saltus of amount $[1 - (n - 1)D]^n$ at $y = (n - 1)D$ 
(see (2.2)). The probability function f(y) is continuous over the 
range $0 \le y < m$ with the exception, in the case of $n \ge 3$ and $(n - 2)D < 1$, 
of a simple discontinuity at $y = (n - 2)D$. 

For n = 2 and D < 1, 

$f(y) = 2(1 - y)$, $\quad (0 \le y < D)$, 

and $\Pr\{y = D\} = (1 - D)^2$. 

For n = 3 and 2D < 1, 

$f(y) = 6(1 - y)y$, $\quad (0 \le y < D)$, 

$f(y) = 6(1 - y)y - 12(1 - y)(y - D) + 6(1 - y)^2$, $\quad (D \le y < 2D)$, 

and $\Pr\{y = 2D\} = (1 - 2D)^3$. 

The expected value, say E(y), of y is: 

(2.7) $E(y) = \frac{n - 1}{n + 1}\{1 - (1 - D)^{n+1}\}$, $(D \le 1)$; $\quad E(y) = \frac{n - 1}{n + 1}$, $(D > 1)$. 

The expected value of S is D + E(y). E(y) can be derived by use of (2.6) 
or by use of a theorem of Robbins [3]. 
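
The probability function of y can be checked numerically. The Python sketch below is illustrative only and rests on an assumption about the form of the density: for $qD \le y < (q+1)D$ it sums, over the number j of gaps exceeding D, the term $n!\binom{n-1}{j}\frac{(1-y)^{j+1}}{(j+1)!}$ times the volume of the slice of the cube of short gaps. The function names are hypothetical.

```python
from math import comb, factorial, floor


def measure_density(y, n, d):
    """Density of y = S - D for n >= 2 uniform points on (0, 1), intervals of length d."""
    m = min(1.0, (n - 1) * d)
    if not 0 <= y < m:
        raise ValueError("y must satisfy 0 <= y < min(1, (n - 1) d)")
    q = floor(y / d)
    total = 0.0
    for j in range(q + 1):           # j gaps are longer than d
        k = n - 1 - j                # the remaining k gaps are shorter than d
        # Volume of the slice {sum of the k short gaps = y - j d} of the cube [0, d]^k.
        slice_volume = sum((-1) ** r * comb(k, r) * (y - (j + r) * d) ** (k - 1)
                           for r in range(q - j + 1)) / factorial(k - 1)
        total += comb(n - 1, j) * (1 - y) ** (j + 1) / factorial(j + 1) * slice_volume
    return factorial(n) * total


def saltus(n, d):
    """Probability that every gap exceeds d, i.e. that y = (n - 1) d."""
    return max(0.0, 1 - (n - 1) * d) ** n


def expected_y(n, d):
    """Expected value of y, equation (2.7)."""
    return (n - 1) / (n + 1) * (1 - (1 - min(d, 1.0)) ** (n + 1))
```

For n = 3 and D = 0.3 the function reproduces the two explicit expressions given for n = 3, and integrating the density numerically and adding the saltus should give unity.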

3. Probability that random linear set covers range of variate. Given that 
$F(x) = x$, $(0 \le x \le 1)$, and $nD \ge 1$, what is the probability, say ${}_nP_D$, that 
$\mathcal{S}(O_n)$ contains the interval $(0 \le x \le 1)$? If D < 1, the interval is covered 
if and only if (i), (ii) and (iii) below are all satisfied: 

(i) $y_u \le D$, $(u = 1, \ldots, n - 1)$, 

(ii) $\sum_{u=1}^{n-1} y_u \ge 1 - y_0 - \frac{D}{2}$, 

(iii) $y_0 \le \frac{D}{2}$. 

${}_nP_D$ can be expressed as follows: 

(3.1) ${}_nP_D = n! \int_{y_0 = 0}^{D/2} \int_{z = 1 - y_0 - D/2}^{1 - y_0} C_{n-1}(z)\, dz\, dy_0,$

where $C_{n-1}(z)$ (see [2]) denotes the (n - 2) dimensional volume of the intersection 
of the hyperplane $\sum_{u=1}^{n-1} y_u = z$ with an (n - 1) cube $0 \le y_u \le D$. It follows from 
(2.5) and (3.1) that 

(3.2) ${}_nP_D = \sum_{u=0}^{[1/D]} (-1)^u \binom{n-1}{u} (1 - uD)^n - 2 \sum_{u=0}^{[1/D - 1/2]} (-1)^u \binom{n-1}{u} \left(1 - uD - \frac{D}{2}\right)^n + \sum_{u=0}^{[1/D - 1]} (-1)^u \binom{n-1}{u} (1 - (u + 1)D)^n,$
where D < 1 and [x] denotes the greatest integer less than x. If $1 \le D < 2$, 
${}_nP_D = 1 - 2\left(1 - \frac{D}{2}\right)^n$. 
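
The covering probability can be evaluated mechanically. The sketch below is an assumption-laden illustration rather than the author's own form: it writes the result as a single alternating sum in which every bracket $(1 - cD)^n$ is replaced by zero when $1 - cD$ is negative, which plays the part of the upper summation limits; the function name is hypothetical.

```python
from math import comb


def coverage_probability(n, d):
    """Probability that n intervals of length d, centered at uniform points, cover (0, 1)."""

    def pos(v):
        # Positive part: terms with a negative base are dropped.
        return v if v > 0 else 0.0

    total = 0.0
    for u in range(n):
        bracket = (pos(1 - u * d) ** n
                   - 2 * pos(1 - (u + 0.5) * d) ** n
                   + pos(1 - (u + 1) * d) ** n)
        total += (-1) ** u * comb(n - 1, u) * bracket
    return total
```

For n = 2 and D = 0.8 a direct geometric argument gives 0.28: the smaller point must lie below 0.4, the larger above 0.6, and the two must be no more than 0.8 apart. The function should agree with that value.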
4. Sampling from $F(x) = \int_0^x H e^{-Ht}\, dt$, $(0 \le x < \infty;\ H > 0)$. If $F(x) = \int_0^x H e^{-Ht}\, dt$, 
the probability function of S can be determined but is very cumber- 
some in the form in which it is known to the writer. The characteristic function, 
say $\varphi(\theta)$, of the probability function of S will be given instead. By use of (2.1) 
it can be shown that: 

(4.1) $\varphi(\theta) = e^{i\theta D} \prod_{\lambda=1}^{n-1} \frac{i\theta\, e^{-D(\lambda H - i\theta)} - \lambda H}{i\theta - \lambda H},$

where $i = \sqrt{-1}$. 

The expected value, E(S), and variance, $\sigma_S^2$, of S are: 

(4.2) $E(S) = D + \frac{1}{H} \sum_{\lambda=1}^{n-1} \frac{1 - e^{-DH\lambda}}{\lambda},$

$\sigma_S^2 = \frac{1}{H^2} \sum_{\lambda=1}^{n-1} \frac{1 - e^{-2DH\lambda}}{\lambda^2} - \frac{2D}{H} \sum_{\lambda=1}^{n-1} \frac{e^{-DH\lambda}}{\lambda}.$
INFORMATION GIVEN BY ODD MOMENTS 

By Edmund Churchill 
Rutgers University 

The widespread use of the third moment about the mean as a measure of skew- 
ness and the belief engendered by this use that a distribution is symmetric if its 
third moment is zero prompt the question of how much information about a 
distribution can be deduced from a knowledge of its odd moments. An answer 
to this question is: Let F(x), a cumulative distribution function; $\{\mu_{2n-1}\}$, $(n = 1, 
2, \ldots)$, a sequence of real numbers; and $\epsilon > 0$ be arbitrary. There exists a c.d.f. 
$F^*(x)$, having as odd moments the terms of the given sequence and such that 

(1) $|F(x) - F^*(x)| < \epsilon$, all x. 

If the mean of F(x) is equal to $\mu_1$ and the variance of F(x) is not zero, it can be 
shown that $F^*(x)$ may be chosen so that in addition the variance of $F^*(x)$ is 
equal to that of F(x). 

An immediate consequence of our statement is that a distribution need not be 
symmetric even though all its odd moments vanish. Such an asymmetric distri- 
bution, due to Stieltjes, is given by: 

(2) $dF(x) = \frac{1}{48}\, e^{-|x|^{1/4}} (1 - k \sin |x|^{1/4})\, dx$, $\ -\infty < x < \infty$, $k = -1$ if $x < 0$, 
$k = 1$ if $x > 0$. 

The proof of our statement will follow easily from the following: 

Lemma. Let $\{\mu_{2n-1}\}$, a sequence of real numbers, be given. There exists a c.d.f. 
having as odd moments the given numbers. 

We construct a sequence $\{H_n\}$ of increasing step-functions in such a manner 
that for every n, the first n odd moments of $H_n$ are the first n of the given numbers, 
and such that this sequence converges to a monotone function having all the 
desired moments. A slight modification of this function will give the desired 
c.d.f. 

Let $H_0$ be identically zero. We form $H_1$ by adding to $H_0$ a jump or mass of $\frac{1}{2}$ 
at $x = 2\mu_1$. In general, $H_k$ is formed from $H_{k-1}$ by adding to it k masses chosen 
so that their first (k - 1) odd moments are zero and so that the kth odd moment 
of $H_k$ is $\mu_{2k-1}$. This we do by adding the masses $|x_j|$, $(j = 1, 2, \ldots, k)$, at the 
points $\epsilon_j j p$ where the $x_j$’s are the solutions of: 

$p x_1 + 2p x_2 + \cdots + kp x_k = 0$

$p^3 x_1 + (2p)^3 x_2 + \cdots + (kp)^3 x_k = 0$

$\cdots$

$p^{2k-3} x_1 + (2p)^{2k-3} x_2 + \cdots + (kp)^{2k-3} x_k = 0$

$p^{2k-1} x_1 + (2p)^{2k-1} x_2 + \cdots + (kp)^{2k-1} x_k = \mu_{2k-1} - m(H_{k-1}),$

$m(H_{k-1})$ is the kth odd moment of $H_{k-1}$, $\epsilon_j$ is the sign of $x_j$ and p is a parameter. 
Since the determinant of this system is a Vandermonde determinant, there exists 
a unique set of solutions for every non-zero value of the parameter. The masses 
thus chosen clearly have the specified moments. Eliminating p from the left 
sides of the equations by division, it is apparent that the $x_j$’s are all linear func- 
tions of $p^{-(2k-1)}$. Thus we may choose p so large that the sum of the masses 
added at this step does not exceed $1/2^k$. The absolute odd moments of orders less 
than (2k - 1) of these k masses are also linear functions of negative powers of p. 
We may thus insure by further increasing our choice of p that the (2k - 1 - 2r)th 
absolute moment of $H_k$ does not exceed the corresponding moment of $H_{k-1}$ by 
more than $1/2^r$. For definiteness, we choose p as the smallest number satisfying 
these requirements. 
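
One step of this construction can be traced with exact rational arithmetic. The Python sketch below is a hedged illustration: it solves the linear system for the $x_j$ by Gauss-Jordan elimination, places the mass $|x_j|$ at the point $\epsilon_j j p$, and evaluates odd moments of the added masses. The function names, and the use of `target` for the right-hand side $\mu_{2k-1} - m(H_{k-1})$, are assumptions made for the example.

```python
from fractions import Fraction


def step_masses(k, p, target):
    """Solve sum_j (j p)^(2i-1) x_j = 0 for i < k and = target for i = k.

    Returns the points eps_j * j * p and the masses |x_j| added at step k.
    """
    p = Fraction(p)
    # Augmented matrix of the system, one row per odd moment 1, 3, ..., 2k - 1.
    rows = [[(j * p) ** (2 * i - 1) for j in range(1, k + 1)] + [Fraction(0)]
            for i in range(1, k + 1)]
    rows[-1][-1] = Fraction(target)

    # Gauss-Jordan elimination; the system is non-singular for p != 0.
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]

    x = [rows[i][-1] for i in range(k)]
    points = [(1 if xj >= 0 else -1) * j * p for j, xj in enumerate(x, start=1)]
    masses = [abs(xj) for xj in x]
    return points, masses


def odd_moment(points, masses, order):
    """Moment of the given (odd) order of the discrete masses."""
    return sum(mass * pt ** order for pt, mass in zip(points, masses))
```

With k = 2 and a target of 6, doubling p from 1 to 2 reduces the total added mass from 3 to 3/8, in line with the statement that the $x_j$ are linear in $p^{-(2k-1)}$.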

The first of these restrictions on p insures that for each value of x, the sequence 
$H_n(x)$ is increasing and bounded from above by one. The sequence of functions 
thus converges to a monotone function $H^*(x)$ with the property that $H^*(-\infty) 
= 0$, $H^*(\infty) \le 1$. The other restrictions on p insure that the sequences of abso- 
lute odd moments of all orders are uniformly bounded, a bound for the abso- 
lute moments of order 2k - 1 being one greater than the absolute moment of 
this order of $H_k$. This in turn insures that the odd moments of $H^*(x)$ exist and 
that they have the desired values. By adding a jump of $1 - H^*(\infty)$ at the 
origin we obtain H(x), a c.d.f. with the given odd moments. 

The main statement of this note is an immediate consequence of the lemma. 
Let the kth odd moment of F(x) be $M_{2k-1}$, which we assume to be finite, and let 
the sequence $\{m_{2k-1}\}$ be defined by the relationships: 

$\mu_{2k-1} = (1 - \epsilon) M_{2k-1} + \epsilon\, m_{2k-1}$, $\quad (k = 1, 2, \ldots)$. 

Let H(x) have the m’s as odd moments. The c.d.f. $F^*(x)$ defined by 

$F^*(x) = (1 - \epsilon) F(x) + \epsilon H(x)$

clearly has the properties stated above, and our statement is proved. If the 
moments of F(x) are not all finite, the proof will need only minor modifications. 

If one asks in addition that F* have a finite range, F* will, in general, not 
exist. If, for example, the range of F is finite and its odd moments are zero, 
then F must be symmetric about the origin, for F* defined by dF*(x) — dF(—x) 
would have the same moments as F. But a c.d.f. with finite range is determined 
by its moments; hence $F(x) = F^*(x)$. 


SOME ORDER STATISTIC DISTRIBUTIONS FOR SAMPLES 
OF SIZE FOUR 

By John E. Walsh 
Princeton University 

1. Summary. Let $x_1, x_2, x_3, x_4$ represent the values of a sample of size four 
drawn from a normal population. There is no loss of generality in assuming 
that the distribution function of this population has zero mean and unit vari- 
ance. Denote it by N(0, 1). Let $x_{(i)}$ be the ith largest of $x_1, x_2, x_3, x_4$. The 
purpose of this note is to determine the joint distribution of 

$x_{(4)} + x_{(3)} - x_{(2)} - x_{(1)}$, $\quad x_{(4)} - x_{(3)} + x_{(2)} - x_{(1)}$, and $x_{(4)} - x_{(3)} - x_{(2)} + x_{(1)}$, 

and derive from this joint distribution the joint distributions of these statistics 
taken in pairs, also the distribution of each statistic itself. 

2. Analysis. Consider the joint distribution of 

$r_1 = \frac{1}{2}(x_4 + x_3 - x_2 - x_1)$

$r_2 = \frac{1}{2}(x_4 - x_3 + x_2 - x_1)$

$r_3 = \frac{1}{2}(x_4 - x_3 - x_2 + x_1).$

Evidently, 

$E(r_i) = 0$, $(i = 1, 2, 3)$. $\quad E(r_i r_j) = 0$, $(i \neq j)$. $\quad E(r_i^2) = 1$. 

Hence the $r_i$ are independently distributed according to N(0, 1). 

Let $v_j$ be the jth largest of $|r_1|$, $|r_2|$, $|r_3|$. Then by first finding the joint 
distribution of $|r_1|$, $|r_2|$, $|r_3|$ and then applying the distribution for order sta- 
tistics [1], it is easily seen that the joint distribution element of $v_1$, $v_2$, $v_3$ is 

$48 f(v_1) f(v_2) f(v_3)\, dv_1\, dv_2\, dv_3,$

where 

$f(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}$, $\quad 0 \le v_1 \le v_2 \le v_3$. 

Examination shows, however, that 

$v_3 = \frac{1}{2}(x_{(4)} + x_{(3)} - x_{(2)} - x_{(1)})$

$v_2 = \frac{1}{2}(x_{(4)} - x_{(3)} + x_{(2)} - x_{(1)})$

$v_1 = \frac{1}{2}\,|x_{(4)} - x_{(3)} - x_{(2)} + x_{(1)}|.$

Let 

$m_3 = x_{(4)} + x_{(3)} - x_{(2)} - x_{(1)}$

$m_2 = x_{(4)} - x_{(3)} + x_{(2)} - x_{(1)}$

$m_1 = x_{(4)} - x_{(3)} - x_{(2)} + x_{(1)}.$

Then the joint distribution element of $|m_1|$, $m_2$ and $m_3$ is 

$6 f(\tfrac{1}{2}|m_1|) f(\tfrac{1}{2} m_2) f(\tfrac{1}{2} m_3)\, d|m_1|\, dm_2\, dm_3.$

Since the function f is symmetrical about the origin, it follows immediately that 
the joint distribution element of $m_1$, $m_2$ and $m_3$ is 

$3 f(\tfrac{1}{2} m_1) f(\tfrac{1}{2} m_2) f(\tfrac{1}{2} m_3)\, dm_1\, dm_2\, dm_3,$

where $|m_1| \le m_2 \le m_3$. 

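3. Derived results. By taking marginal distributions it is found that the 
joint distribution elements of $m_1$, $m_2$ and $m_3$ taken in pairs are 

$\phi_1(m_1, m_2)\, dm_1\, dm_2 = 3 \left(\int_{m_2}^{\infty} f(\tfrac{1}{2} y)\, dy\right) f(\tfrac{1}{2} m_1) f(\tfrac{1}{2} m_2)\, dm_1\, dm_2,$

$\phi_2(m_1, m_3)\, dm_1\, dm_3 = 3 \left(\int_{|m_1|}^{m_3} f(\tfrac{1}{2} y)\, dy\right) f(\tfrac{1}{2} m_1) f(\tfrac{1}{2} m_3)\, dm_1\, dm_3,$

$\phi_3(m_2, m_3)\, dm_2\, dm_3 = 3 \left(\int_{-m_2}^{m_2} f(\tfrac{1}{2} y)\, dy\right) f(\tfrac{1}{2} m_2) f(\tfrac{1}{2} m_3)\, dm_2\, dm_3.$

The distribution elements of $m_1$, $m_2$ and $m_3$ are seen to be 

$g_1(m_1)\, dm_1 = \frac{3}{2} \left(\int_{|m_1|}^{\infty} f(\tfrac{1}{2} y)\, dy\right)^2 f(\tfrac{1}{2} m_1)\, dm_1,$

$g_2(m_2)\, dm_2 = 3 \left(\int_{m_2}^{\infty} f(\tfrac{1}{2} y)\, dy\right) \left(\int_{-m_2}^{m_2} f(\tfrac{1}{2} y)\, dy\right) f(\tfrac{1}{2} m_2)\, dm_2,$

$g_3(m_3)\, dm_3 = 3 \left(\int_{0}^{m_3} f(\tfrac{1}{2} y)\, dy\right)^2 f(\tfrac{1}{2} m_3)\, dm_3.$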
It is to be noted that if a > 0, 

$\Pr(0 < m_1 < a) = \Pr(-a < m_1 < 0) = \frac{1}{2} - 4 \left(\int_{a/2}^{\infty} f(y)\, dy\right)^3,$

$\Pr(0 < m_2 < a) = 12 \left(\int_{0}^{a/2} f(y)\, dy\right)^2 - 16 \left(\int_{0}^{a/2} f(y)\, dy\right)^3,$

$\Pr(0 < m_3 < a) = 8 \left(\int_{0}^{a/2} f(y)\, dy\right)^3,$

so that the probability that any of $m_1$, $m_2$, $m_3$ lie between two given numbers 
is expressed explicitly and can be calculated with the aid of standard tables for 
the normal distribution. 
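
In place of printed tables, the three probabilities can be evaluated with the normal integral from the Python standard library. The sketch below is illustrative only; it assumes the closed forms $\frac{1}{2} - 4Q^3$, $12P^2 - 16P^3$ and $8P^3$, where P is the normal area from 0 to a/2 and Q the area above a/2, and it adds a small seeded simulation as an independent check. All function names are hypothetical.

```python
import random
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal distribution function


def pr_m1(a):
    """Pr(0 < m1 < a) = 1/2 - 4 Q^3, with Q the normal area above a/2."""
    upper = 1 - _PHI(a / 2)
    return 0.5 - 4 * upper ** 3


def pr_m2(a):
    """Pr(0 < m2 < a) = 12 P^2 - 16 P^3, with P the normal area from 0 to a/2."""
    central = _PHI(a / 2) - 0.5
    return 12 * central ** 2 - 16 * central ** 3


def pr_m3(a):
    """Pr(0 < m3 < a) = 8 P^3."""
    central = _PHI(a / 2) - 0.5
    return 8 * central ** 3


def simulate_pr_m3(a, trials=20000, seed=1):
    """Monte Carlo estimate of Pr(0 < m3 < a) from samples of four normal values."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1, x2, x3, x4 = sorted(rng.gauss(0.0, 1.0) for _ in range(4))
        if 0 < x4 + x3 - x2 - x1 < a:
            hits += 1
    return hits / trials
```

As a grows large the three functions tend to 1/2, 1 and 1 respectively, as they must, since $m_1$ is symmetric about zero while $m_2$ and $m_3$ are never negative.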


4. Generalization of method. The method used to obtain the joint distribu- 
tion of the order statistics $m_1$, $m_2$ and $m_3$ was to take all possible combinations of 
4 variables with two plus and two minus signs (except for factor of -1) and 
show that these combinations behave as normally distributed independent 
variables. The question arises as to whether this method of finding order sta- 
tistic distributions would apply in general to 2n variables with n plus and n 
minus signs. It is easily proved that this will occur only when n = 2. 

REFERENCES 

[1] S. S. Wilks, Mathematical Statistics, Princeton Univ. Press, 1943, p. 90. 



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest 

Institute of Statistics of the University of North Carolina 

Announcement of detailed plans for the North Carolina All-University Insti- 
tute of Statistics has been made by Professor Gertrude M. Cox, Director of the 
Institute. 

To provide graduate-level training for students in statistics and to combine 
the theoretical or mathematical statistics with applied or experimental statistics, 
a Graduate Department of Mathematical Statistics is being set up at Chapel 
Hill with Professor Harold Hotelling as Head. The existing Department of 
Experimental Statistics at Raleigh is a part of the Institute, and will be headed 
by Professor Gertrude M. Cox with Professor W. G. Cochran as Director of 
Research. Professors Hotelling and Cochran will be Associate Directors of the 
Institute. 

Professor Hotelling, who will head the Department at Chapel Hill, comes to 
North Carolina from Columbia University, where he has been directing its 
graduate mathematical statistics program. Previously, he had held positions 
with the University of Washington, Princeton University and Stanford Uni- 
versity. His undergraduate training was taken at the University of Washington 
where he majored in journalism; his Master of Science degree was awarded by 
the same institution in mathematics; and his doctorate by Princeton University, 
also in mathematics. In addition, he has done some graduate work at the Uni- 
versity of Chicago. Professor Hotelling’s publications in mathematical statistics 
are numerous and well known. Among the members of his staff will be a visiting 
professor, M. S. Bartlett, on leave of absence from Cambridge University. A 
graduate of Cambridge and native of England, Bartlett has also held positions 
with the University of London and the Imperial Chemical Industries, and during 
the war was engaged in war research in London. 

In addition, P. L. Hsu, William Madow, and Herbert Robbins will be mem- 
bers of the Department at Chapel Hill as associate professors. Hsu, a native 
of China, has held teaching positions with the University of Peking and the Uni- 
versity of London. He received his degrees from Tsinghua University and 
the University of London. 

Madow is now in Brazil, where he is serving as a visiting professor of statistics 
at the University of Sao Paulo. He received his training, both undergraduate 
and graduate, from Columbia University, and has worked with the Department 
of Agriculture Graduate School and the Bureau of the Census in Washington. 

Robbins will come to the University of North Carolina from New York Uni- 
versity where he has been serving as an assistant professor. Prior to that, 
he was a staff member of the postgraduate school of the U. S. Naval Academy, 
and an instructor in mathematics at New York University and at Harvard 
University. He holds A.B., A.M. and Ph.D. degrees from Harvard University. 
The appointment of Edward Paulson as an instructor completes the initial De- 
partment staff at Chapel Hill. A graduate of Brooklyn College and holder of an 
M.A. degree from Columbia University, Paulson has been more recently study- 
ing mathematical statistics at Columbia under a pre-doctoral fellowship of the 
National Research Council. 

Professor Cochran came to North Carolina in March from Ames, Iowa, 
where he had been serving as professor in the statistical laboratory of Iowa 
State College. During the war years he was sent to England, Germany, and 
Austria on special work for the War Department, after spending a year at 
Princeton University where he served as research statistician on war work. A 
native of Glasgow, Scotland, Cochran has been in the United States since 1939, 
and is a naturalized citizen. Before coming to America, he was employed as 
statistician with the Rothamsted Experimental Station in England. Cochran’s 
publications in both the theory of statistics and applied statistics are well 
known, as is his experience with practical research problems. He is serving this 
year as president of the Institute of Mathematical Statistics. He is a fellow 
of the American Statistical Association and a fellow of the Royal Statistical 
Society of England. 

Under the plans of the Institute, students who are preparing to teach statis- 
tics or to develop statistical theory will take most of their training at Chapel 
Hill. However, work between the two branches will be so coordinated as to 
include instruction in the application of statistics as taught in Raleigh. 

For students who intend to become statistical consultants in various other 
fields, basic training will be taken in mathematical statistics, with the main part 
of the advanced applied training at Raleigh. 

For research students, on both campuses, who are working in other sciences, 
including agriculture, biology, medicine, psychology, sociology, economics, in- 
dustry, and textiles, training in both basic and applied statistics will be given. 

Working with Cochran in Raleigh are Professor J. A. Rigney; Associate 
Professors R. L. Anderson, J. M. Clarkson, H. L. Lucas, and Paul Peach; 
Assistant Professor H. F. Robinson; Instructors Margaret Fleming, R. I. Monroe 
and Sarah Porter. 

Collaborators working with the Raleigh unit are A. L. Finkner, W. A. Hen- 
dricks and F. E. McVay of the Bureau of Agricultural Economics; C. E. La- 
moureaux and G. P. Weber of the Weather Bureau; and D. D. Mason of the 
Bureau of Plant Industry. 


Joint Session of the Institute and Section A of the AAAS 

A joint session of the Institute of Mathematical Statistics and Section A of 
the American Association for the Advancement of Science was held in the 
Municipal Auditorium at St. Louis on Saturday, March 30, 1946 at 2:00 P.M. 
At this session invited addresses were given by Lieutenant Commander John H. 
Curtiss on Statistical Inference and its Engineering Applications, and by Mr. 
Morris H. Hansen on Some Sampling Problems in Surveys of Business and 
Population. 


Personal Items 

Dr. Paul H. Anderson is at present Economic Analyst with the War Assets 
Corporation at Washington. He is also teaching mathematics in the evening 
school of American University. 

Assistant Professor T. A. Bancroft has returned from a teaching position 
at the University Study Center at Florence, Italy, to his position at Iowa State 
College. 

Associate Dean Walter Bartky of the University of Chicago has been appointed 
Dean of the Division of Physical Sciences. 

Mr. Gordon L. Beckstead is working toward his doctorate in statistics at the 
University of California. 

Mr. Donald Cody has returned to his position as Assistant Actuary at the 
Equitable Life Assurance Society after spending three years in war research 
with the NDRC, the Naval Ordnance Station at Indianapolis, and the Naval 
Ordnance Station at Inyokern, California. 

Professor Allen T. Craig, after war service at the Postgraduate School of the 
U. S. Naval Academy at Annapolis, has returned to his position at the University 
of Iowa. 

Mr. James H. Davidson is studying for his doctorate in chemistry at Princeton 
University. 

Associate Professor J. L. Doob of the University of Illinois has been promoted 
to a professorship. 

Assistant Professor Churchill Eisenhart of the University of Wisconsin has 
been promoted to an associate professorship. 

Dr. Wayne Gutzman, recently discharged from the Navy as Lieutenant, has 
assumed his new duties as Assistant Professor of Mathematics at the Postgradu- 
ate School, Naval Academy, Annapolis, Maryland. 

Mr. Bernard Hecht has been discharged from the Army and is now Chief 
Quality Control Engineer with the International Resistance Company at Phila- 
delphia. 

Dr. D. G. Humm has been elected president of the Southern California Acad- 
emy of Criminology. 

Mr. Amrom H. Katz is in charge of a group of physicists, engineers, and aerial 
photographers representing the Aerial Photographic Laboratory at Wright 
Field, which will record photographically various aspects of the forthcoming 
atomic bomb test at Bikini Island. 

Mr. Edward A. Lew has been released from active duty and has returned to 
his former position as Assistant Actuary of the Metropolitan Life Insurance 
Company. 

Dr. E. V. Lewis is Junior Research Associate with E. I. du Pont de Nemours 
at the Nylon Research Laboratory at Wilmington. 

Associate Professor M. C. MacPhail of Acadia University, Wolfville, Nova 
Scotia, has been promoted to a professorship. 

Mr. C. J. Maloney has been appointed to an instructorship in the department 
of mathematics at Iowa State College. 

Dr. Edward B. Olds is director of the Research Bureau of the Social Planning 
Council of St. Louis and St. Louis County. 

Dr. A. M. Peiser has been appointed head of the Statistics Research Group 
at the Langley Field Laboratory of the National Advisory Committee for Aero- 
nautics. 

Mr. Robert J. Saunders has been released from the Army and is now connected 
with Mohawk Carpet Mills at Amsterdam, N. Y. 

Mr. Benjamin Stauber is now Chief of the Relocation Planning Division, War 
Relocation Authority. He has transferred from the Department of Agriculture 
for this work. 

Mr. Arthur I. Sternhell returned from the Army to his position as general 
staff assistant in the Field Management Division of the Metropolitan Life 
Insurance Company. 

Mr. Harry Weingarten has been appointed Tutor of Mathematics at the Col- 
lege of the City of New York. 

Assistant Professor J. R. Vatnsdal has finished his army service and has 
returned to the State College of Washington where he was promoted to an asso- 
ciate professorship. 

Mr. Bertram Yood has completed his duty in the navy and is now at Yale 
Station, Connecticut. 

A symposium on mathematical statistics and probability was held at the 
University of California at Berkeley, January 28-30, 1946. 


New Members 

The following persons have been elected to membership in the Institute: 

Alchian, Prof. Armen A., Ph.D. (Stanford) Univ. of Oregon, Capt. (A.C.) Hq. AAF 
Training Command, Ft. Worth, Texas 

Bingham, M.D. 1920 S St., N. W., Washington, D. C. 

Cannon, Edward W., Ph.D. (Johns Hopkins) Comdr., US Navy, Research and Standards 
Branch of Bureau of Ships, Cannon, Delaware 

Carvalho, Prof. Pedro Egydio, Ph.D. (São Paulo) Univ. de São Paulo, Faculdade de 
Higiene, Avenida Dr. Arnaldo SB, Caixa postal 99-B, São Paulo, Brazil 

Delsa, Alexis, A. I. Lg. (Liège) Mgr., Basic Bessemer Steelworks, Société Anonyme John 
Cockerill, Seraing, Belgium 

Duncan, David Beattie, B.Sc. (Sydney) Graduate Student, Iowa State, Statistical 
Laboratory, Ames, Iowa 

Froelich, Kathryn, B.A. (Evansville) Statistician, US Dept. of Agriculture, Bureau of 
Human Nutrition and Home Economics, 1806 Monroe St., N. W., Washington 10, D. C. 

Goldstine, Herman H., Ph.D. (Chicago) Institute for Advanced Study, Princeton, N. J. 

Hammond, Edward Cuyler, Sc.D. (Johns Hopkins) Major A.C., US AAF, Chief, Statistics 
of Flying Personnel Branch, Office of the Air Surgeon, 4700 Connecticut Ave., Washington, 
D. C. 

Hsu, Prof. Pao-Lu, Ph.D. (London) Columbia University, 1027 John Jay Hall, Columbia 
Univ., New York City 

Kyle, Garland Dean, M.S. (Michigan) Spectroanalyst, Physicist (US Navy) £848 Filbert, 
Philadelphia 39, Penn. 

Leibler, Richard A., Ph.D. (Illinois) Instructor, Purdue Univ., Math. Dept., Lafayette, 
Indiana 

Lessard, Prof. Roger, C.E. (Montreal) Hull Technical School, Hull, Quebec, Canada 

Mosimann, Thomas F., A.B. (Charleston) US Bur. Labor Statistics, Regional Employment 
Analyst, 4216 Western Ave., Dallas 11, Texas 

Patte, W. Edmund, B.A.Sc. (Toronto) Stat. Eng., Canadian Industries Ltd., Shawinigan 
Falls, P.Q. Canada, 660-16th St., Almaville 

Piza, Prof. Affonso P. de Toledo, Ph.D. (São Paulo) Escola Politechnica, São Paulo, Brazil, 
Rua Ministro Godoy, 1123 

Rozen, Daniel I., A.B. (Columbia) Stat., Medical Statistics Div., Office of the Surgeon 
General, War Department, Rm 317-1, 3416 38th St., N. W., Washington, D. C. 

Saidel, Frank, M.A. (Michigan State) Instructor in Math., Michigan State, East Lansing, 
Michigan 

Schmalz, William Herbert, B.Sc.A. (Toronto) Technical Superintendent, Dominion Rub- 
ber Company Limited, Merchants Rubber Factory, 61 Breithaupt St., Kitchener, Ont. 

Stehn, John R., Ph.D. (Wisconsin) Physicist, Research Division, Winchester Repeating 
Arms Co., New Haven, Conn. 

Tsao, Prof. Fei, Ph.D. (Minnesota) National Central University, Chungking, China 

Weaver, Chalmers L., B.S. (Kent State) Asst. Actuary, New England Mutual Life Ins. 
Co., 601 Boylston St., Boston, Mass. 

Weber, C. Jerome (New York) Personal Trust Officer, The Chase National Bank of the 
City of New York, 11 Broad Street, New York City, Chappaqua, New York, Box 63 

Whitney, Donald Ransom, M.A. (Princeton) Grad. Asst., Math. Dept., Ohio State Univ., 
Columbus, Ohio 

Wright, C. Ashley, M.A. (Princeton) Econ. Stat., Standard Oil Company, N. J., Box 34, 
RFD 6, Alexandria, Va. 

Yost, Earl K., Jr., B.S. (Washington and Jefferson) Grad. Asst., Math., Univ. of Oklahoma, 
College Ave., Norman, Okla. 



REPORT ON THE APRIL MEETING OF THE WASHINGTON 
CHAPTER OF THE INSTITUTE 

A meeting of the Washington Chapter of the Institute of Mathematical 
Statistics was held at George Washington University, Washington, D. C., 
on Friday and Saturday, April 12 and 13, 1946, in conjunction with a meeting 
of the Washington Chapter of the American Statistical Association. 

More than 100 people attended the meetings, including the following 51 mem- 
bers of the Institute: 

Theodore W. Anderson, Jr., Richard O. Been, Archie Blake, David Blackwell, J. B. Bod- 
die, Glenn W. Brier, William Cohen, Jerome Cornfield, John H. Curtiss, Bessie B. Day, 
Robert Dorfman, Thomas I. Edwards, Andrew Fraser, Meyer A. Girshick, Clyde H. Graves, 
Margaret J. Hagood, Major Edward C. Hammond, Morris H. Hansen, Alston S. Householder, 
Leonid Hurwicz, Irwin E. Jackson, Jr., Walter Jacobs, Hyman B. Kaitz, H. S. Konijn, Lila F. 
Knudsen, Colonel S. Kullback, R. B. Ladd, H. G. Landau, Walter Leighton, Gerson Levin, 
Jacob E. Lieberman, Sophie Marcuse, Ethelyne L. McBee, William J. McCabe, Francis 
McIntyre, Dorothy Morrow, H. W. Norton, W. R. Pabst, Carl J. Rees, David Rosenblatt, 
M. Sandomire, Edward M. Schrock, L. W. Shaw, John H. Smith, Frederick F. Stephan, 
F. M. Wadley, A. Wald, F. M. Weida, Samuel Weiss, S. S. Wilks, C. P. Young. 

The session Friday evening was devoted to the following contributed papers: 

1. Estimation of the Parameters of a Single Stochastic Difference Equation in a Complete 
System. 

T. W. Anderson and H. Rubin, Cowles Commission for Economic Research 
M. A. Girshick, Bureau of Agricultural Economics 
Presented by T. W. Anderson 

2. Estimation of Linear Functions of Cell Proportions. 

J. H. Smith, Bureau of Labor Statistics 

3. On Functions of Sequences of Independent Chance Vectors with Applications to the 
Random Walk Problem in k dimensions. 

D. Blackwell, Howard University 
M. A. Girshick, Bureau of Agricultural Economics 
Presented by D. Blackwell 

4. The Exact Power Curve and Distribution of n for the Sequential Binomial Probability 
Ratio Test. 

M. A. Girshick, Bureau of Agricultural Economics 

At a business meeting following the session of contributed papers, Professor 
F. M. Weida and Dr. John H. Smith were elected to succeed Colonel Kullback 
and Dr. Madow as members of the Program Committee. 

The program for Saturday morning was devoted to the following invited 
lectures: 

1. Recent Developments in the Measurement of Simultaneous Economic Relations. 

T. Koopmans, Cowles Commission for Economic Research 

2. Structural Estimation versus Regression: Use for Policy and Prediction. 

Leonid Hurwicz, Cowles Commission for Economic Research 

The program for Saturday afternoon was devoted to the following: 

1. Basic Concepts Underlying Sequential Analysis with Applications. 

A. Wald, Columbia University 

2. Applications of Sequential Analysis to Acceptance Inspection. 

W. R. Pabst, Navy Department 

Irving Siegel, Veterans Administration, was chairman for the morning session 
and Professor F. M. Weida, George Washington University, for the afternoon 
session. 

A lively discussion followed the presentation of the papers. 

S. Kullback, 
Secretary, Washington Chapter. 




SAMPLE CRITERIA FOR TESTING EQUALITY OF MEANS, EQUALITY 
OF VARIANCES, AND EQUALITY OF COVARIANCES IN A 
NORMAL MULTIVARIATE DISTRIBUTION 

By S. S. Wilks 
Princeton University 

Summary. In this paper statistical test criteria are developed for testing 
equality of means, equality of variances and equality of covariances in a normal 
multivariate population of $k$ variables on the basis of a sample. More spe- 
cifically, three statistical hypotheses are considered: (i) $H_{mvc}$, the hypothesis 
that the means are equal, the variances are equal, and the covariances are 
equal, (ii) $H_{vc}$, the hypothesis that variances are equal and covariances are 
equal, irrespective of the values of the means, and (iii) $H_m$, the hypothesis of 
equal means, assuming variances are equal and covariances are equal. 

Test criteria $L_{mvc}$, $L_{vc}$, and $L_m$ are developed by the Neyman-Pearson method 
of likelihood ratios for testing $H_{mvc}$, $H_{vc}$ and $H_m$ respectively. The exact 
moments of each of the three test criteria when the three corresponding hypoth- 
eses are true are determined for any number $k$ of variables and for any size, 
$n$, of the sample for which the distributions exist. The exact distributions of 
$L_{mvc}$ and $L_{vc}$ are determined for $k = 2$ and $k = 3$, and the exact distribution of 
$L_m$ is found for any $k$; these are all beta (Pearson Type I) distributions. Tables 
of 5% and 1% points of $L_{mvc}$, $L_{vc}$ and $L_m$, based on Thompson’s tables of 
percentage points of the Incomplete Beta Function, are given for certain values 
of $k$ and $n$ (Tables I and II). Also tables of values of approximate 5% and 1% 
points of $-n \ln L_{mvc}$, $-n \ln L_{vc}$ and $-n(k-1) \ln L_m$ for large values of $n$ are 
given (Table III), based on the fact that these three quantities are approximately 
distributed according to chi-square laws for large values of $n$ with $\tfrac{1}{2}k(k+3) - 3$, 
$\tfrac{1}{2}k(k+1) - 2$, and $k - 1$ degrees of freedom respectively. A table (Table IV) 
is given which shows how accurate the resulting approximate 5% and 1% points 
of $L_{mvc}$, $L_{vc}$ and $L_m$ are. 

The paper is written in two parts. In Part I the problem of testing the three 
hypotheses is discussed and the mathematical results are presented together 
with an illustrative example. Part II is given for the reader who wishes to study 
the mathematical derivation of the results. 

I. The Problem and a Statement of Results 

1.1. Introduction. Situations occasionally arise in which it may be desired 
to test the hypothesis that the means are equal, the variances are equal and the 
covariances are equal in a multivariate population in which the variables are 
correlated, the test to be made on the basis of a sample from such a population. 
In the case of a normal multivariate distribution this means testing the hypo- 
thesis that the distribution is symmetric with respect to the variables. 

As an example,¹ suppose three “parallel forms” of a test are constructed and 
all are given to a group of $n$ college entrance students. On the basis of the 
scores of the $n$ students on the three tests, how could one test the hypothesis 
that the three tests are really parallel forms, as far as means, variances and 
covariances are concerned? In other words, how could one test the hypo- 
thesis that the scores can be regarded as being from a sample of individuals 
from a college entrance population of individuals in which the distribution 
function of the three variables is such that the means of the three variables are 
all equal, the variances are equal and the covariances are equal? Actually, as 
far as practical considerations are concerned in testing work, it is frequently 
sufficient to consider only normally distributed populations. One may therefore 
raise the question as to how to test the hypothesis that the three-variable 
sample can be considered as having come from a normal three-variable popula- 
tion which is symmetrical in the three variables, i.e. a normal population in 
which the means are equal, the variances are equal, and the covariances are 
equal. Or more generally, one may raise the analogous question for the case 
of $k$ variables. 

Similarly, one could mention biological examples which have been treated by 
intra-class correlation methods and raise the question as to whether the under- 
lying multivariate distribution can be judged to be symmetric in the variables 
on the basis of information supplied by the sample. 

To attempt to deal with this problem by comparing means, or variances or 
covariances two at a time or performing what might appear to be extensions of 
existing tests for two or more independent samples of one variable leads to com- 
plications because of correlation among the variables in the original population. 
What is needed is some kind of a comprehensive test which will take into account 
all means, variances and covariances at one time. If it turns out that the hypoth- 
esis of equal means, equal variances and equal covariances is not supported 
by the sample, then one can raise the question as to whether the sample supports 
the hypothesis that the variances are equal and covariances are equal irrespective 
of means. If the answer is yes here, one can ask the further question as to 
whether the sample supports the hypothesis of equal means. Such tests will be 
developed in this paper for samples from a normal multivariate population. 
More specifically three tests are developed, (i) test $L_{mvc}$ for testing the hypoth- 
esis $H_{mvc}$ that all means are equal, all variances are equal and all covariances 
are equal, (ii) test $L_{vc}$ for the hypothesis $H_{vc}$ that all variances are equal and 
all covariances are equal, irrespective of the values of the means, and (iii) test 
$L_m$ for the hypothesis $H_m$ that the means are equal, assuming that $H_{vc}$ is true, 
i.e. that the variances are equal and the covariances equal. 

¹ The problem treated in this paper arose from discussions with Professor Harold O. 
Gulliksen, of the Psychology Department of Princeton University, in connection with the 
problem of testing whether two or more forms of an examination can be considered as 
“parallel forms”. The author would like to take this opportunity to acknowledge various 
helpful discussions he has also had with his colleague Professor John W. Tukey in con- 
nection with this paper. 

There are rather obvious extensions of the hypotheses $H_{mvc}$, $H_{vc}$ and $H_m$ 
and their corresponding test criteria. For example, one could divide the vari- 
ables in the multivariate population into two sets, and consider the hypothesis 
$H'_{mvc}$ (say), analogous to $H_{mvc}$, that the means are equal, the variances are equal 
and the covariances are equal within each of the two sets and that the covariances 
of variables between the two sets are all equal. Similarly, $H'_{vc}$ and $H'_m$ could 
be defined so as to be analogous to $H_{vc}$ and $H_m$. However, these extensions 
will not be considered in this paper. 

In Part I of this paper we shall discuss the problem of testing hypotheses 
regarding equality of means, equality of variances, and equality of covariances 
in a normal multivariate population, and summarize the mathematical results 
which have been obtained. An illustrative example will also be given. The 
derivation of the test criteria and their sampling theory is presented in Part II 
of the paper. 

1.2. The hypotheses to be tested. We assume that there is a $k$-variate 
population $\Pi$ in which the variables $x_1, x_2, \cdots, x_k$ are distributed according to 
a normal $k$-variate probability density function such that the mean value of 
$x_i$ is $a_i$ $(i = 1, 2, \cdots, k)$ and the variance-covariance matrix of $x_1, x_2, \cdots, x_k$ 
is $\| \rho_{ij}\sigma_i\sigma_j \|$, $\rho_{ij}$ being the correlation coefficient between $x_i$ and $x_j$ $(i \neq j)$, and 
$\sigma_i$ being the standard deviation of $x_i$. 

In specifying the hypotheses to be considered it will be convenient to define 
three conditions on the parameters of population $\Pi$: 

Condition $C_m$: that the means of the $x_i$ are all equal. 

Condition $C_v$: that the variances of the $x_i$ are all equal. 

Condition $C_c$: that the covariances of the $x_i$ and $x_j$ $(i \neq j)$ are all equal. 

The hypotheses regarding $\Pi$ to be tested are as follows: 

$H_{mvc}$: that conditions $C_m$, $C_v$, and $C_c$ hold. 

$H_{vc}$: that conditions $C_v$ and $C_c$ hold. 

$H_m$: that condition $C_m$ holds, assuming that $H_{vc}$ is true. 

A precise statement of these hypotheses in terms of Neyman-Pearson likeli- 
hood ratio terminology will be found in Part II. 

It should be noted that $H_{mvc}$ is a comprehensive hypothesis which specifies 
equality of means, equality of variances and equality of covariances and would 
be tested if one is interested in all of these quantities as a system. On the other 
hand $H_{vc}$ refers only to equality of variances and equality of covariances re- 
gardless of what values the means may have. $H_{vc}$ would be tested if one is only 
concerned with equality of variances and equality of covariances. $H_m$ is a more 
restrictive hypothesis than either $H_{mvc}$ or $H_{vc}$, for it refers to equality of 
means under the assumption that $H_{vc}$ is true. In other words, $H_m$ can only be 
tested accurately when $H_{vc}$ is true. $H_m$ would be a generalization of the Behrens- 
Fisher problem [1] when $H_{vc}$ is false. 

1.3. The sample test criteria. The three hypotheses $H_{mvc}$, $H_{vc}$ and $H_m$ are 
to be tested on the basis of a sample $O_n$ from $\Pi$ consisting of the following values 
of the $x$'s: $x_{i\alpha}$, $i = 1, 2, \cdots, k$, $\alpha = 1, 2, \cdots, n$. 

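The criteria for testing $H_{mvc}$, $H_{vc}$, and $H_m$ depend on the following quantities 
to be determined from the sample: 

$$ \bar{x}_i = \frac{1}{n}\sum_{\alpha=1}^{n} x_{i\alpha}, \qquad s_{ij} = \frac{1}{n}\sum_{\alpha=1}^{n} (x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j), \tag{1.1} $$

$$ s^2 = \frac{1}{k}\sum_{i=1}^{k} s_{ii}, \tag{1.2} $$

$$ s^2 r = \frac{1}{k(k-1)}\sum_{i \neq j} s_{ij}. \tag{1.3} $$

The sample criteria, based on the method of likelihood ratios, for testing 
$H_{mvc}$, $H_{vc}$ and $H_m$ are respectively, as follows: 

$$ L_{mvc} = L_{vc}\, L_m^{\,k-1}, \tag{1.4} $$

$$ L_{vc} = \frac{|s_{ij}|}{(s^2)^k (1 - r)^{k-1}\,(1 + (k-1)r)}, \tag{1.5} $$

$$ L_m = \frac{s^2(1 - r)}{s^2(1 - r) + \dfrac{1}{k-1}\displaystyle\sum_{i=1}^{k} (\bar{x}_i - \bar{x})^2}, \tag{1.6} $$

where $|s_{ij}|$ is the determinant of sample variances and covariances and $\bar{x}$ is 
the mean of the $\bar{x}_i$. 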

The range of values of each of the three criteria is from 0 to 1. A necessary 
and sufficient condition for each criterion to have the value 1 is that the hypoth- 
esis for which the criterion is a test be (accidentally) identically supported 
by the sample. If the hypothesis (any one of the three being considered) is 
true, the average value of the corresponding criterion will be less than 1, but 
this average value will be nearer 1 than when the hypothesis is false. 
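
The computation of the three criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample covariances are taken with divisor $n$, the criteria are taken in the forms $L_{vc} = |s_{ij}| / [(s^2)^k (1-r)^{k-1}(1+(k-1)r)]$, $L_m = s^2(1-r) / [s^2(1-r) + \sum_i(\bar{x}_i - \bar{x})^2/(k-1)]$ and $L_{mvc} = L_{vc} L_m^{k-1}$, and the function names and the score matrix `SCORES` are hypothetical, not taken from the paper.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    size, value = len(a), 1.0
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            value = -value
        value *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for j in range(c, size):
                a[r][j] -= factor * a[c][j]
    return value


def wilks_criteria(x):
    """Return (L_mvc, L_vc, L_m) for x[i][a]: variable i, individual a."""
    k, n = len(x), len(x[0])
    xbar = [sum(row) / n for row in x]              # means of the k variables
    grand = sum(xbar) / k                           # mean of those means
    # Sample variances and covariances (ASSUMPTION: divisor n).
    s = [[sum((x[i][a] - xbar[i]) * (x[j][a] - xbar[j]) for a in range(n)) / n
          for j in range(k)] for i in range(k)]
    s2 = sum(s[i][i] for i in range(k)) / k         # average variance
    s2r = sum(s[i][j] for i in range(k) for j in range(k) if i != j) / (k * (k - 1))
    r = s2r / s2                                    # average correlation
    l_vc = det(s) / (s2 ** k * (1 - r) ** (k - 1) * (1 + (k - 1) * r))
    within = s2 * (1 - r)
    between = sum((m - grand) ** 2 for m in xbar) / (k - 1)
    l_m = within / (within + between)
    return l_vc * l_m ** (k - 1), l_vc, l_m


# Hypothetical scores of n = 6 students on k = 3 test forms (made-up numbers).
SCORES = [
    [12.0, 15.0, 11.0, 18.0, 14.0, 16.0],
    [14.0, 13.0, 12.0, 19.0, 15.0, 14.0],
    [11.0, 17.0, 10.0, 17.0, 16.0, 18.0],
]

if __name__ == "__main__":
    print(wilks_criteria(SCORES))
```

Each value lies between 0 and 1, and all three are unchanged when the variables are relabelled or when every score undergoes the same change of origin and scale, as one would expect of criteria for symmetry in the variables.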

If $H_{mvc}$ is true (i.e., found to be supported by the sample on the basis of the 
test $L_{mvc}$) then there will be three parameters which characterize $\Pi$, namely, $a$ 
(the common mean), $\sigma^2$ (the common variance), and $\rho$ (the common correlation 
coefficient). The best estimates of these three parameters are, respectively: 

$$ \bar{x} = \frac{1}{k}\sum_{i=1}^{k} \bar{x}_i, \qquad s_0^2 = s^2 + \frac{1}{k}\sum_{i=1}^{k} (\bar{x}_i - \bar{x})^2, \qquad r_0 = \frac{1}{s_0^2}\left[ s^2 r - \frac{1}{k(k-1)}\sum_{i=1}^{k} (\bar{x}_i - \bar{x})^2 \right]. \tag{1.7} $$

If $H_{vc}$ is true (i.e., found to be supported by the sample on the basis of the 
test $L_{vc}$) there will be $k + 2$ parameters which characterize $\Pi$, namely the means 
$a_1, a_2, \cdots, a_k$, $\sigma^2$ (the common variance) and $\rho$ (the common correlation coeffi- 
cient). The best estimates of these parameters are, respectively 

$$ \bar{x}_1, \bar{x}_2, \cdots, \bar{x}_k, \; s^2, \text{ and } r. \tag{1.8} $$

In order to be able to use the three sample criteria $L_{mvc}$, $L_{vc}$ and $L_m$ for testing 
the hypotheses $H_{mvc}$, $H_{vc}$, $H_m$, it is necessary to have their distribution func- 
tions under the assumptions that the respective hypotheses $H_{mvc}$, $H_{vc}$ and $H_m$ 
are true. 


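1.4. Sampling theory of the test criteria. The moments of the exact sampling 
distributions of $L_{mvc}$ and $L_{vc}$ when $H_{mvc}$ and $H_{vc}$ are true respectively, have been 
determined for all values of $k$ (number of variables) and all values of $n$ (sample 
size) for which such distributions exist; i.e., for $k \geq 2$ and $n > k$. The $g$-th 
moments of the distributions of the two criteria are as follows: 

$$ M_g(L_{mvc}) = (k-1)^{g(k-1)} \prod_{i=2}^{k} \frac{\Gamma(\tfrac{1}{2}(n-i) + g)}{\Gamma(\tfrac{1}{2}(n-i))} \cdot \frac{\Gamma(\tfrac{1}{2}(k-1)n)}{\Gamma(\tfrac{1}{2}(k-1)n + g(k-1))} \tag{1.9} $$

and 

$$ M_g(L_{vc}) = (k-1)^{g(k-1)} \prod_{i=2}^{k} \frac{\Gamma(\tfrac{1}{2}(n-i) + g)}{\Gamma(\tfrac{1}{2}(n-i))} \cdot \frac{\Gamma(\tfrac{1}{2}(k-1)(n-1))}{\Gamma(\tfrac{1}{2}(k-1)(n-1) + g(k-1))}. \tag{1.10} $$

For the cases of $k = 2$ and $k = 3$, these moments simplify so that the distribu- 
tion functions of $L_{mvc}$ and $L_{vc}$ can be readily inferred. They turn out to be as 
follows: 

For $k = 2$: 

$$ dF(L_{mvc}) = \tfrac{1}{2}(n-2)\, L_{mvc}^{\frac{1}{2}(n-4)}\, dL_{mvc}, \tag{1.11} $$

$$ dF(L_{vc}) = \frac{\Gamma(\tfrac{1}{2}(n-1))}{\sqrt{\pi}\,\Gamma(\tfrac{1}{2}(n-2))}\, L_{vc}^{\frac{1}{2}(n-4)} (1 - L_{vc})^{-\frac{1}{2}}\, dL_{vc}. \tag{1.12} $$

For $k = 3$: 

$$ dF(L_{mvc}) = \frac{\Gamma(n)}{2\,\Gamma(n-3)} \left(\sqrt{L_{mvc}}\right)^{n-4} \left(1 - \sqrt{L_{mvc}}\right)^{2} d\sqrt{L_{mvc}}, \tag{1.13} $$

$$ dF(L_{vc}) = \frac{\Gamma(n-1)}{\Gamma(n-3)} \left(\sqrt{L_{vc}}\right)^{n-4} \left(1 - \sqrt{L_{vc}}\right) d\sqrt{L_{vc}}. \tag{1.14} $$

The distribution function of $L_m$ when the hypothesis $H_m$ is true has been found 
to be 

$$ dF(L_m) = \frac{\Gamma(\tfrac{1}{2}n(k-1))}{\Gamma(\tfrac{1}{2}(n-1)(k-1))\,\Gamma(\tfrac{1}{2}(k-1))}\, L_m^{\frac{1}{2}(n-1)(k-1) - 1} (1 - L_m)^{\frac{1}{2}(k-1) - 1}\, dL_m. \tag{1.15} $$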


Details of the derivation of these distribution functions will be found in Part 
II. 
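
As a quick check on how the moments determine the distributions, the case $k = 2$ of the moment formula for $L_{vc}$ can be worked through. The sketch below assumes (1.10) in the form $M_g(L_{vc}) = (k-1)^{g(k-1)} \prod_{i=2}^{k} \Gamma(\tfrac12(n-i)+g)/\Gamma(\tfrac12(n-i)) \cdot \Gamma(\tfrac12(k-1)(n-1))/\Gamma(\tfrac12(k-1)(n-1)+g(k-1))$ and uses only the standard expression for the moments of a beta variable with parameters $p$ and $q$.

```latex
\begin{align*}
M_g(L_{vc})\Big|_{k=2}
  &= \frac{\Gamma\!\left(\tfrac12(n-2)+g\right)}{\Gamma\!\left(\tfrac12(n-2)\right)}
     \cdot
     \frac{\Gamma\!\left(\tfrac12(n-1)\right)}{\Gamma\!\left(\tfrac12(n-1)+g\right)}, \\[4pt]
\int_0^1 x^{g}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx
  &= \frac{\Gamma(p+g)\,\Gamma(p+q)}{\Gamma(p)\,\Gamma(p+q+g)} .
\end{align*}
```

The two expressions agree for every $g$ when $p = \tfrac12(n-2)$ and $q = \tfrac12$, so that for $k = 2$ the criterion $L_{vc}$ has a beta distribution with these parameters, in agreement with (1.12), where $\Gamma(\tfrac12) = \sqrt{\pi}$.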

In a paper published elsewhere in the present issue of the Annals of Mathe- 
matical Statistics, Tukey and Wilks [2] show how the probability integrals of 
$L_{mvc}$ and $L_{vc}$ and of other statistical criteria having moments of a rather general 
class can be fitted by Incomplete Beta Functions in such a way that all moments 
of the fitted distribution agree with those of the actual distribution up to and 
including terms of order $1/n$. 

It will be noted that the probability integrals of $L_{mvc}$ and $L_{vc}$ for $k = 2$, those 
of $\sqrt{L_{mvc}}$ and $\sqrt{L_{vc}}$ for $k = 3$, and that of $L_m$ for any value of $k$, are Incomplete 
Beta Functions [3], with the following values of $p$ and $q$: 

k    criterion           p                          q
2    $L_{mvc}$           $\tfrac12(n-2)$            $1$
2    $L_{vc}$            $\tfrac12(n-2)$            $\tfrac12$
3    $\sqrt{L_{mvc}}$    $n-3$                      $3$
3    $\sqrt{L_{vc}}$     $n-3$                      $2$
k    $L_m$               $\tfrac12(n-1)(k-1)$       $\tfrac12(k-1)$


Percentage points² of the distributions of these criteria for the cases men- 
tioned in this table can therefore be read from Thompson’s [4] tables of per cent 
points for the Incomplete Beta Function. 5% and 1% points for $L_{mvc}$ and $L_{vc}$ 
for $k = 2$ and 3 are given in Table I for certain values of $n$. Table II shows 
5% and 1% points of $L_m$ for certain values of $n$ for $k = 2, 3, 4, 5$ and 6. 
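
Where the beta parameters are whole numbers, the percentage points can also be computed directly rather than read from tables. The following sketch is an illustration only, not the method used for the printed tables: it assumes $L_{mvc}$ for $k = 2$ has probability integral $L^{(n-2)/2}$, and that $\sqrt{L_{mvc}}$ and $\sqrt{L_{vc}}$ for $k = 3$ are beta variables with $(p, q) = (n-3, 3)$ and $(n-3, 2)$. The function names are hypothetical, and the Incomplete Beta Function for integer parameters is evaluated as a binomial tail sum and inverted by bisection.

```python
import math


def beta_cdf_int(x, a, b):
    """I_x(a, b) for positive integers a, b, written as a binomial tail sum."""
    m = a + b - 1
    return sum(math.comb(m, j) * x ** j * (1.0 - x) ** (m - j)
               for j in range(a, m + 1))


def solve_increasing(f, target, lo=0.0, hi=1.0, iterations=200):
    """Bisection for f(x) = target, f increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def point_l_mvc(eps, n, k):
    """100*eps % point of L_mvc for k = 2 or k = 3 (n > k)."""
    if k == 2:
        return eps ** (2.0 / (n - 2))           # probability integral is L^((n-2)/2)
    if k == 3:
        root = solve_increasing(lambda x: beta_cdf_int(x, n - 3, 3), eps)
        return root ** 2                         # the beta variable is sqrt(L_mvc)
    raise ValueError("only k = 2 and k = 3 are handled in this sketch")


def point_l_vc_k3(eps, n):
    """100*eps % point of L_vc for k = 3 (n > 3)."""
    root = solve_increasing(lambda x: beta_cdf_int(x, n - 3, 2), eps)
    return root ** 2


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, point_l_mvc(0.05, n, 2), point_l_mvc(0.05, n, 3), point_l_vc_k3(0.05, n))
```

The case of $L_{vc}$ for $k = 2$ is left out because its parameter $q = \tfrac12$ is not a whole number, so the binomial sum does not apply.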


² The $100\epsilon$% point, say $L_\epsilon$, of a given criterion $L$ (any of those being considered) having 
distribution $dF(L)$ is given by $\int_0^{L_\epsilon} dF(L) = \epsilon$. 


TABLE I 

5% and 1% points of $L_{mvc}$ and $L_{vc}$ for $k = 2$ and $k = 3$ 

k = 2 

   n      L_mvc 5%    L_mvc 1%    L_vc 5%     L_vc 1%
   3      0.0025      0.0001      0.0062      0.0002
   4      0.0500      0.0100      0.0975      0.0199
   5      0.1357      0.0464      0.2285      0.0808
   6      0.2236      0.1000      0.3416      0.1588
   7      0.3017      0.1585      0.4307      0.2352
   8      0.3684      0.2154      0.5005      0.3039
   9      0.4249      0.2683      0.5559      0.3637
  10      0.4729      0.3162      0.6007      0.4154
  11      0.5139      0.3594      0.6375      0.4601
  12      0.5493      0.3981      0.6682      0.4989
  13      0.5800      0.4329      0.6943      0.5328
  14      0.6070      0.4642      0.7165      0.5626
  15      0.6307      0.4924      0.7358      0.5889
  16      0.6518      0.5180      0.7528      0.6124
  17      0.6707      0.5411      0.7675      0.6334
  18      0.6877      0.5623      0.7807      0.6522
  19      0.7030      0.5817      0.7925      0.6693
  20      0.7169      0.5995      0.8031      0.6848
  21      0.7294      0.6159      0.8126      0.6989
  22      0.7411      0.6310      0.8213      0.7119
  23      0.7518      0.6450      0.8292      0.7237
  24      0.7616      0.6579      0.8365      0.7347
  25      0.7707      0.6700      0.8431      0.7448
  26      0.7791      0.6813      0.8493      0.7542
  27      0.7869      0.6918      0.8549      0.7629
  28      0.7942      0.7017      0.8602      0.7710
  29      0.8010      0.7110      0.8651      0.7786
  30      0.8074      0.7197      0.8697      0.7857
  31      0.8133      0.7279      0.8739      0.7924
  32      0.8190      0.7356      0.8779      0.7987
  42      0.8609      0.7943      0.9073      0.8454
  62      0.9050      0.8577      0.9375      0.8945
 122      0.9513      0.9261      0.9684      0.9460
  ∞       1.0000      1.0000      1.0000      1.0000

k = 3 

   n      L_mvc 5%    L_mvc 1%    L_vc 5%     L_vc 1%
   4      0.00029     0.00001     0.00064     0.00003
   5      0.0095      0.0018      0.0183      0.0035
   6      0.0358      0.0112      0.0618      0.0198
   7      0.0736      0.0300      0.1174      0.0493
   8      0.1165      0.0559      0.1749      0.0866
   9      0.1603      0.0860      0.2297      0.1272
  10      0.2028      0.1181      0.2802      0.1682
  11      0.2432      0.1508      0.3259      0.2079
  12      0.2808      0.1829      0.3670      0.2457
  13      0.3157      0.2141      0.4040      0.2811
  14      0.3480      0.2439      0.4373      0.3141
  15      0.3778      0.2722      0.4674      0.3448
  16      0.4052      0.2990      0.4946      0.3732
  17      0.4306      0.3243      0.5193      0.3996
  18      0.4540      0.3482      0.5418      0.4240
  23      0.5484      0.4482      0.6293      0.5230
  33      0.6660      0.5811      0.7326      0.6470
  63      0.8135      0.7591      0.8549      0.8029
  ∞       1.0000      1.0000      1.0000      1.0000


TABLE II 

5% and 1% points of $L_m$ 


1.5. The equivalence of $L_m$ and an analysis of variance test for a $k$ by $n$ lay- 
out. One can set up a Snedecor $F$ ratio for testing hypothesis $H_m$ by setting 

$$ F = \frac{(n-1)(k-1)(1 - L_m)}{(k-1)\,L_m} \tag{1.16} $$

and entering the $F$ tables with $n_1 = k - 1$ and $n_2 = (n-1)(k-1)$ degrees of 
freedom. Making use of the definitions of $s^2$ and $r$ in $L_m$, one finds that $F$ 
can be written as 

$$ F = \frac{S_1/(k-1)}{S_2/[(n-1)(k-1)]} \tag{1.17} $$

where $S_1 = n\sum_{i=1}^{k} (\bar{x}_i - \bar{x})^2$, $S_2 = \sum_{i=1}^{k}\sum_{\alpha=1}^{n} (x_{i\alpha} - \bar{x}_{\cdot\alpha} - \bar{x}_i + \bar{x})^2$ and 
$\bar{x}_{\cdot\alpha} = \frac{1}{k}\sum_{i=1}^{k} x_{i\alpha}$. Thus, the use of $L_m$ as a criterion for testing $H_m$ is equivalent 
to an analysis of variance test for testing “row” effects in a $k$ by $n$ rectangular 
layout when rows are associated with the $k$ variables in the multivariate popula- 
tion and columns are associated with the $n$ individuals in the sample. 
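
The equivalence stated in Section 1.5 can be checked numerically. The sketch below is illustrative and rests on assumptions not spelled out in the surviving text: the sample covariances use divisor $n$, $L_m$ is taken as $s^2(1-r) / [s^2(1-r) + \sum_i (\bar{x}_i - \bar{x})^2/(k-1)]$, and the $F$ ratio is the usual row-effect ratio of a two-way layout with one observation per cell. The function names and the data are hypothetical.

```python
def row_effect_f(x):
    """Analysis-of-variance F for row effects in a k-by-n layout x[i][a]."""
    k, n = len(x), len(x[0])
    row_mean = [sum(row) / n for row in x]
    col_mean = [sum(x[i][a] for i in range(k)) / k for a in range(n)]
    grand = sum(row_mean) / k
    s1 = n * sum((m - grand) ** 2 for m in row_mean)             # between rows
    s2 = sum((x[i][a] - col_mean[a] - row_mean[i] + grand) ** 2  # residual
             for i in range(k) for a in range(n))
    return (s1 / (k - 1)) / (s2 / ((n - 1) * (k - 1)))


def criterion_l_m(x):
    """L_m from the average variance and covariance (ASSUMPTION: divisor n)."""
    k, n = len(x), len(x[0])
    row_mean = [sum(row) / n for row in x]
    grand = sum(row_mean) / k

    def cov(i, j):
        return sum((x[i][a] - row_mean[i]) * (x[j][a] - row_mean[j])
                   for a in range(n)) / n

    s2 = sum(cov(i, i) for i in range(k)) / k
    s2r = sum(cov(i, j) for i in range(k) for j in range(k) if i != j) / (k * (k - 1))
    within = s2 - s2r                                            # s^2 (1 - r)
    between = sum((m - grand) ** 2 for m in row_mean) / (k - 1)
    return within / (within + between)


# Hypothetical layout: k = 3 variables (rows), n = 6 individuals (columns).
LAYOUT = [
    [12.0, 15.0, 11.0, 18.0, 14.0, 16.0],
    [14.0, 13.0, 12.0, 19.0, 15.0, 14.0],
    [11.0, 17.0, 10.0, 17.0, 16.0, 18.0],
]

if __name__ == "__main__":
    n = len(LAYOUT[0])
    l_m = criterion_l_m(LAYOUT)
    print(row_effect_f(LAYOUT), (n - 1) * (1.0 - l_m) / l_m)
```

The two printed numbers coincide up to rounding, because under these assumptions $S_2 = n(k-1)s^2(1-r)$ and hence $L_m = S_2/(S_1 + S_2)$.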


1.6. Approximate sampling theory of the test criteria for large samples. In 
the case of large samples, it follows from a theorem [5] concerning the distribution 
of likelihood ratio criteria for large samples that $-n \ln L_{mvc}$, $-n \ln L_{vc}$, and 
$-n(k-1) \ln L_m$ are approximately distributed according to chi-square distribu- 
tions with $\tfrac{1}{2}k(k+3) - 3$, $\tfrac{1}{2}k(k+1) - 2$, and $k - 1$ degrees of freedom respec- 
tively. Approximate 5% and 1% points of these three quantities taken from 
Thompson’s [6] tables of the percentage points of the chi-square distribution 
are given in Table III. 

Table IV is given in order to furnish some idea of how the accuracy of the 
approximations provided by Table III depends on $n$. It will be noted that the 
approximate values exceed the exact values in every case, differences occurring 
in the third decimal place in almost every case in which $n$ exceeds 60. The ap- 
proximate percentages to which the approximate per cent points correspond 
are given by the numbers in the parentheses in Table IV. These numbers in 
each case were obtained by linear interpolation from the exact 5% and 1% 
points. 
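
The comparison of exact and approximate points can be illustrated for $L_{mvc}$ with $k = 2$, where both are available in closed form. The sketch below is a hedged illustration rather than the computation behind the printed tables: it assumes the exact probability integral of $L_{mvc}$ for $k = 2$ is $L^{(n-2)/2}$, and it uses the fact that a chi-square variable with 2 degrees of freedom has upper $100\epsilon$% point $-2\ln\epsilon$. The function names are hypothetical.

```python
import math


def exact_point(eps, n):
    """Exact 100*eps % point of L_mvc for k = 2 (probability integral L^((n-2)/2))."""
    return eps ** (2.0 / (n - 2))


def approx_point(eps, n):
    """Large-sample point: -n ln L set equal to the chi-square point with 2 d.f."""
    chi_square_point = -2.0 * math.log(eps)      # upper 100*eps % point, 2 d.f.
    return math.exp(-chi_square_point / n)


def interpolated_percent(n):
    """Percentage matched by the approximate 5% point, by linear interpolation
    between the exact 1% and 5% points."""
    lo, hi = exact_point(0.01, n), exact_point(0.05, n)
    return 1.0 + 4.0 * (approx_point(0.05, n) - lo) / (hi - lo)


if __name__ == "__main__":
    for n in (30, 62, 122):
        print(n, exact_point(0.05, n), approx_point(0.05, n), interpolated_percent(n))
```

For each $n$ the approximate point exceeds the exact one, and the gap narrows as $n$ grows.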


1.7. Comparison of $L_{vc}$ with Mauchly’s “sphericity” test. The criterion 
$L_{vc}$ for testing hypothesis $H_{vc}$ is, in a sense, an extension of a test developed by 
Mauchly [7] for testing the hypothesis of “sphericity” of a normal multivariate 
distribution. Mauchly’s test was designed for testing the hypothesis that all 
variances are equal, and that all covariances are equal to zero, irrespective of the 
values of the population means. The likelihood criterion for testing this hypoth- 
esis of “sphericity” is 

$$ L_s = \frac{|s_{ij}|}{(s^2)^k} \tag{1.18} $$

which should be compared with L vc . Actually, Mauchly used •%/ L. as the test 
criterion, which, of course, is equivalent to using L. . The g-th moment of L, 
when the hypothesis of sphericity is true is given by 


(1.19) 


k° k 


k r t 

■n 1 

*-i L 


r(i(« -i) + 0)1 r (**(n - l)) 


r(i(n - *')) J r(P(n - 1) + gk) 
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TABLE III 


Approximate 5% and 1% pom to for — n In Lmvn h In 1j vc , and n(k 1 ) In L m 

for k = 2, B, 4, 5, 6. 


k 


— n In L 

m re 


— nln l 

p 0 


- n, (fc — 1) In L„ 

d.f. 

5% 

1% 

d.f. 

5% 

5% 

df. 

5% 

1% 

2 

2 

5.99147 


1 

3.84140 


1 

3.84140 

0.63490 

3 

6 

12 5916 

16.8119 

4 


13.2707 

2 

5.99147 

9.21034 

4 

11 



8 



3 

7.81473 

11.3449 

5 

17 

27.5871 

ia 

13 

22.3621 


4 

9.48773 

13.2767 

6 

24 

36.4151 

42.9798 

19 



5 

11.0705 

15.0863 


TABLE IV

Table indicating the accuracy of the approximate 5% and 1% points of $L_{mvc}$, $L_{vc}$
and $L_m$ provided by Table III

                                   5%                           1%
criterion    k      n      exact    approx.            exact    approx.
L_mvc        2     30      .8074    .8190 (5.53)*      .7197    .7357 (1.73)*
L_mvc        2     62      .9050    .9079 (5.25)       .8577    .8619 (1.36)
L_mvc        2    122      .9513    .9521 (5.13)       .9261    .9273 (1.19)
L_mvc        3     33      .6660    .6828 (5.79)       .5811    .6008 (1.88)
L_mvc        3     63      .8135    .8188 (5.40)       .7591    .7658 (1.49)
L_vc         2     30      .8697    .8799 (5.49)       .7857    .8016 (1.76)
L_vc         2     62      .9375    .9399 (5.22)       .8945    .8985 (1.37)
L_vc         2    122      .9684    .9690 (5.11)       .9460    .9471 (1.20)
L_vc         3     33      .7326    .7501 (5.82)       .6470    .6688 (2.01)
L_vc         3     63      .8549    .8602 (5.41)       .8029    .8100 (1.55)
L_m          2     31      .8779    .8835 (5.28)       .7987    .8073 (1.43)
L_m          2     61      .9375    .9389 (5.13)       .8945    .8969 (1.20)
L_m          2    121      .9684    .9688 (5.07)       .9460    .9467 (1.13)
L_m          3     31      .9050    .9079 (5.25)       .8577    .8619 (1.36)
L_m          3     61      .9513    .9521 (5.10)       .9261    .9273 (1.14)
L_m          4     41      .9372    .9385 (5.19)       .9101    .9119 (1.26)
L_m          5     31      .9246    .9264 (5.25)       .8961    .8984 (1.32)

*The numbers in the parentheses are approximate percentages (obtained by linear
interpolation) to which the approximate percent points correspond.


which should be compared with the $g$-th moment of $L_{vc}$. Stated in other words,
Mauchly’s criterion $L_s$ is a test for the hypothesis that contours of equal probability
density in the multivariate normal population distribution are spheres,
while $L_{vc}$ is a test for the hypothesis that the contours of equal probability are
$k$-dimensional ellipsoids with $k - 1$ equal axes in general shorter than the $k$-th
axis which is equally inclined to the $k$ coordinate axes of the distribution
function.

1.8. Illustrative Example. As an example to illustrate the use of the test
criteria $L_{mvc}$, $L_{vc}$, $L_m$, we shall consider data on three forms of a subtest in
verbal aptitude, and inquire as to whether the data are consistent with the
hypothesis of the three forms being “parallel forms”.

A procedure³ was used for partitioning the first 60 items of an entire test of 80 items
into three sets of 20 items each by using only a “difficulty” and a “validity”
index on each of the items. A random sample of 100 test booklets was selected
from those in which the first 60 items had been attempted. Total scores were
obtained on each of the three subtests selected in this manner. The question
is this: Does this procedure of selecting items produce “parallel” subtests?
In other words, considering the three scores on the three subtests in each of the
100 test booklets as a sample of 100 items from a trivariate normal population,
is the sample consistent with the hypothesis $H_{mvc}$ of equal means, equal variances
and equal covariances? If not, is the sample consistent with the hypothesis $H_{vc}$
of equal variances and equal covariances irrespective of means? If the answer
to this question is no, then the failure of the tests to be parallel is at least partially
attributable to differences in variances and/or differences in covariances. If
the answer to the question is yes, we test $H_m$, the hypothesis of equal means,
assuming equal variances and equal covariances. If the sample is not consistent
with $H_m$, then the subtests fail to be parallel because of significant differences in
means.

If we denote the three subtests by $T_1$, $T_2$, $T_3$, and the scores of the $\alpha$-th
individual in the sample on the three tests by $x_{1\alpha}$, $x_{2\alpha}$, $x_{3\alpha}$ respectively
($\alpha = 1, 2, \ldots, 100$), the information in the sample needed for computing $L_{mvc}$,
$L_{vc}$ and $L_m$ and testing $H_{mvc}$, $H_{vc}$ and $H_m$ is as follows:

$\bar{x}_1 = 10.9900$          $s^2 = 17.5558$
$\bar{x}_2 = 10.9300$          $s_0^2 = 17.5764$
$\bar{x}_3 = 11.2600$          $r = .7963$
$s_{11} = 16.8451$             $r_0 = .7948$
$s_{22} = 18.1099$             $|s_{ij}| = 545.5308$
$s_{33} = 17.7124$
$s_{12} = 13.5493$
$s_{13} = 14.5826$
$s_{23} = 13.8056$

³ Devised by Mr. L. R. Tucker of the College Entrance Examination Board. The author
is indebted to Mr. Tucker for the data used in the illustrative example.

Using formulas (1.4), (1.5), and (1.6), for $k = 3$, for calculating the values of
$L_{mvc}$, $L_{vc}$ and $L_m$, we find

$L_{mvc} = .9209$
$L_{vc} = .9370$
$L_m = .9914$
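The three values can be reproduced from the sample means and the $s_{ij}$ alone. The sketch below is illustrative rather than the author's own computing scheme: it uses the expressions for the criteria as they appear in (2.18), (2.37) and (2.54), with $s^2$, $r$, $s_0^2$ and $r_0$ as defined in (2.36) and (2.15). The function names `det3` and `criteria` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def criteria(means, s):
    """Return (L_mvc, L_vc, L_m) for k = 3 from the sample means and the matrix s_ij."""
    k = len(means)
    grand = sum(means) / k
    dev = sum((m - grand) ** 2 for m in means)          # sum of (x-bar_i - x-bar)^2
    tr = sum(s[i][i] for i in range(k))                 # sum of variances
    off = sum(s[i][j] for i in range(k) for j in range(k) if i != j)
    s_sq, r = tr / k, off / ((k - 1) * tr)              # (2.36)
    s0_sq = (tr + dev) / k                              # (2.15)
    r0 = (off - dev) / ((k - 1) * (tr + dev))           # (2.15)
    d = det3(s)
    l_vc = d / (s_sq ** k * (1 - r) ** (k - 1) * (1 + (k - 1) * r))
    l_mvc = d / (s0_sq ** k * (1 - r0) ** (k - 1) * (1 + (k - 1) * r0))
    l_m = s_sq * (1 - r) / (s0_sq * (1 - r0))
    return l_mvc, l_vc, l_m


means = [10.9900, 10.9300, 11.2600]
s = [[16.8451, 13.5493, 14.5826],
     [13.5493, 18.1099, 13.8056],
     [14.5826, 13.8056, 17.7124]]
l_mvc, l_vc, l_m = criteria(means, s)
print(round(l_mvc, 4), round(l_vc, 4), round(l_m, 4))
```

The output should match the values quoted in the text up to rounding in the last figure, and the three numbers should satisfy $L_{mvc} = L_{vc}L_m^{2}$, the $k = 3$ case of the relation given in Section 2.4.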

It will be seen from Table III that the 5% point of $-n \ln L_{mvc}$ for
$k = 3$ is 12.5916. Setting $-100 \ln L_{mvc} = 12.5916$ and solving, we find the
approximate 5% point of $L_{mvc}$ to be .8817, which is considerably less than the
observed value of $L_{mvc}$, namely .9209. Hence, the sample is consistent with
$H_{mvc}$. As a matter of fact the observed value .9209 lies at approximately
the 25% point of $L_{mvc}$.

In practice, there would be no point in proceeding to test $H_{vc}$ or $H_m$, because
if $L_{mvc}$ is non-significant there is a high probability (not certainty) that both $L_{vc}$
and $L_m$ will be non-significant. But for illustrative purposes, it is perhaps useful
to consider $L_{vc}$ and $L_m$ anyway.

The 5% point of $-n \ln L_{vc}$ for $k = 3$ is 9.48773 (see Table III). Setting
$-100 \ln L_{vc} = 9.48773$ and solving, we get .9095 as the approximate 5% point
of $L_{vc}$, which is considerably less than the observed value .9370, thus indicating
that $L_{vc}$ is not significant at the 5% level. In fact the observed value .9370
lies between the 25% and 10% points of $L_{vc}$.

The 5% point of $-n(k-1) \ln L_m$ for $k = 3$ is 5.99147. Setting $-200 \ln L_m =
5.99147$ and solving we get .9704 as the approximate 5% point. Since the observed
value of $L_m$ exceeds .9704, we find $L_m$ not significant at the 5% level. In
fact, the observed value .9914 lies between the 50% and 25% points.
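The same three decisions can be reached by comparing the large-sample statistics with the chi-square points directly, instead of converting the points to the scale of $L$. The sketch below assumes the observed values and the Table III points quoted in this example; the helper name `large_sample_stat` is made up.

```python
import math


def large_sample_stat(value, multiplier):
    """-multiplier * ln(value): the statistic referred to the chi-square table."""
    return -multiplier * math.log(value)


n, k = 100, 3
tests = [  # (criterion, observed value, multiplier of -ln L, 5% chi-square point)
    ("L_mvc", 0.9209, n, 12.5916),
    ("L_vc", 0.9370, n, 9.48773),
    ("L_m", 0.9914, n * (k - 1), 5.99147),
]
for name, value, mult, point in tests:
    stat = large_sample_stat(value, mult)
    verdict = "significant" if stat > point else "not significant"
    print(f"{name}: statistic {stat:.3f}, 5% point {point}, {verdict} at the 5% level")
```

A small value of $L$ corresponds to a large value of the statistic, so the rejection region is the upper tail of the chi-square distribution.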

II. Derivation of Results 

In this part we shall derive the criteria $L_{mvc}$, $L_{vc}$ and $L_m$ for testing $H_{mvc}$,
$H_{vc}$ and $H_m$ by the Neyman-Pearson method of likelihood ratios, and determine
the distribution theory of the criteria.

2.1. The test $L_{mvc}$ for $H_{mvc}$, the hypothesis of equality of means, equality of
variances and equality of covariances.

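2.1.1. Derivation of the criterion $L_{mvc}$. Let $\Pi$ be a normal $k$-variate population,
in which $x_1, x_2, \ldots, x_k$ are variables, such that $a_i$ is the mean of $x_i$, $\sigma_i^2$ the
variance of $x_i$ and $\rho_{ij}\sigma_i\sigma_j$ the covariance ($\rho_{ij}$ the correlation coefficient) between
$x_i$ and $x_j$. The distribution law of $x_1, x_2, \ldots, x_k$ in the population is

$$\frac{|A_{ij}|^{\frac{1}{2}}}{(2\pi)^{\frac{1}{2}k}}\exp\left[-\tfrac{1}{2}\sum_{i,j=1}^{k}A_{ij}(x_i - a_i)(x_j - a_j)\right] \tag{2.1}$$

where $\|A_{ij}\|$ is symmetric and is the inverse of the variance-covariance matrix,
i.e. $\|A_{ij}\| = \|\rho_{ij}\sigma_i\sigma_j\|^{-1}$, ($\rho_{ii} = 1$).

Now suppose $O_n$ is a random sample of $n$ individuals from population $\Pi$,
and let $x_{i\alpha}$ be the value of $x_i$ for the $\alpha$-th individual in the sample. Then,
the probability function for the entire sample (likelihood function) is

$$P = \frac{|A_{ij}|^{\frac{1}{2}n}}{(2\pi)^{\frac{1}{2}kn}}\exp\left[-\tfrac{1}{2}\sum_{i,j=1}^{k}\sum_{\alpha=1}^{n}A_{ij}(x_{i\alpha} - a_i)(x_{j\alpha} - a_j)\right] \tag{2.2}$$

The hypothesis which we wish to test is that the population means $a_1, a_2, \ldots, a_k$
are equal, the variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$ are all equal and the covariances
$\rho_{12}\sigma_1\sigma_2, \rho_{13}\sigma_1\sigma_3, \ldots, \rho_{k-1,k}\sigma_{k-1}\sigma_k$ are all equal, the test to be made on the
basis of the sample of values $x_{i\alpha}$. In other words, we wish to test the hypothesis
that

$$a_1 = a_2 = \cdots = a_k = a, \qquad \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 = \sigma^2, \qquad \rho_{ij} = \rho \ \ (i \neq j). \tag{2.3}$$

Testing the hypothesis that (2.3) holds is equivalent to testing the hypothesis
that

$$a_1 = a_2 = \cdots = a_k = a, \qquad \begin{Vmatrix} A_{11} & A_{12} & \cdots & A_{1k} \\ A_{21} & A_{22} & \cdots & A_{2k} \\ \vdots & \vdots & & \vdots \\ A_{k1} & A_{k2} & \cdots & A_{kk} \end{Vmatrix} = \begin{Vmatrix} A & B & \cdots & B \\ B & A & \cdots & B \\ \vdots & \vdots & & \vdots \\ B & B & \cdots & A \end{Vmatrix} \tag{2.4}$$

where

$$A = \frac{1 + (k-2)\rho}{\sigma^2(1-\rho)(1+(k-1)\rho)}, \qquad B = \frac{-\rho}{\sigma^2(1-\rho)(1+(k-1)\rho)}. \tag{2.5}$$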
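The expressions for $A$ and $B$ in (2.5) can be verified by multiplying the equal-variance, equal-covariance matrix by the matrix having $A$ on the diagonal and $B$ elsewhere. The following is a small numerical sketch; the function names and the particular values of $\sigma^2$, $\rho$ and $k$ are chosen only for illustration.

```python
def precision_entries(sigma_sq, rho, k):
    """A and B of (2.5) for common variance sigma_sq and common correlation rho."""
    denom = sigma_sq * (1 - rho) * (1 + (k - 1) * rho)
    return (1 + (k - 2) * rho) / denom, -rho / denom


def inverse_error(sigma_sq, rho, k):
    """Largest deviation of (covariance matrix) x (A/B matrix) from the identity."""
    a, b = precision_entries(sigma_sq, rho, k)
    cov = [[sigma_sq if i == j else rho * sigma_sq for j in range(k)] for i in range(k)]
    prec = [[a if i == j else b for j in range(k)] for i in range(k)]
    prod = [[sum(cov[i][m] * prec[m][j] for m in range(k)) for j in range(k)]
            for i in range(k)]
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(k) for j in range(k))


print(inverse_error(17.5, 0.8, 3))  # should be zero up to rounding error
```

The formulas require $1 - \rho \neq 0$ and $1 + (k-1)\rho \neq 0$, which is the condition for the covariance matrix to be non-singular.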


To obtain the likelihood criterion $L_{mvc}$ for testing the hypothesis $H_{mvc}$ we
maximize the likelihood (2.2) under two conditions, for the given sample $O_n$,
and take the ratio of the two resulting maxima. First, we maximize (2.2) over
the set $\Omega$ of admissible values of the parameters, i.e. with respect to all means
$a_i$ and all variances and covariances $\rho_{ij}\sigma_i\sigma_j$, denoting the resulting maximum
of (2.2) by $P_\Omega$. Secondly, we maximize (2.2) over the set of values $\omega$ of the
parameters which satisfy the hypothesis $H_{mvc}$; that is, we replace in (2.2) each
mean $a_1, a_2, \ldots, a_k$ by $a$, each of the variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$ by $\sigma^2$ and
each of the covariances $\rho_{ij}\sigma_i\sigma_j$, ($i \neq j$), by $\rho\sigma^2$ and then maximize (2.2) with
respect to $a$, $\sigma^2$, and $\rho$, denoting the resulting maximum by $P_\omega$.

Maximizing (2.2) under the first set of conditions is equivalent to maximizing
it with respect to the $a_i$ and the $A_{ij}$, while maximizing (2.2) under the second
set of conditions is equivalent to imposing condition (2.4) and maximizing it
with respect to $a$, $A$ and $B$.

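The values of the $a_i$ and $A_{ij}$ which maximize (2.2) under the first set of conditions
are given by solving the following $(k^2 + 3k)/2$ equations:

$$\frac{\partial P}{\partial a_i} = 0, \qquad i = 1, 2, \ldots, k \tag{2.6}$$

$$\frac{\partial P}{\partial A_{ij}} = 0, \qquad i, j = 1, 2, \ldots, k, \ (i \leq j). \tag{2.7}$$

Expressions for these equations are

$$n\left[\sum_{j=1}^{k}A_{ij}(\bar{x}_j - a_j)\right]P = 0, \qquad i = 1, 2, \ldots, k \tag{2.8}$$

$$\left[nA^{ij} - \sum_{\alpha=1}^{n}(x_{i\alpha} - a_i)(x_{j\alpha} - a_j)\right]P = 0, \qquad i, j = 1, 2, \ldots, k, \ (i \leq j), \tag{2.9}$$

where $A^{ij}$ is the element in the $i$-th row and $j$-th column of $\|A_{ij}\|^{-1}$, i.e.
$A^{ij} = \rho_{ij}\sigma_i\sigma_j$, and $\bar{x}_i = \frac{1}{n}\sum_{\alpha=1}^{n}x_{i\alpha}$.

The solution of (2.8) and (2.9) is

$$a_i = \bar{x}_i, \quad i = 1, 2, \ldots, k; \qquad A^{ij} = s_{ij}, \ \text{or} \ A_{ij} = s^{ij}, \quad i, j = 1, 2, \ldots, k, \ (i \leq j) \tag{2.10}$$

where $s_{ij} = \frac{1}{n}\sum_{\alpha=1}^{n}(x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j)$, and where $\|s^{ij}\| = \|s_{ij}\|^{-1}$. Inserting
the values of (2.10) in (2.2) and noting that the exponent in (2.2) reduces
to $-\frac{n}{2}\sum_{i,j=1}^{k}s^{ij}s_{ij}$, which in turn reduces to $-\tfrac{1}{2}kn$, since $\sum_{i=1}^{k}s^{ij}s_{ij} = 1$
for each value of $j$, we obtain

$$P_\Omega = \frac{e^{-\frac{1}{2}kn}}{|s_{ij}|^{\frac{1}{2}n}(2\pi)^{\frac{1}{2}kn}}. \tag{2.11}$$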

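In order to obtain $P_\omega$, we specialize the $a_i$ and the matrix $\|A_{ij}\|$ in (2.2)
in accordance with (2.4), noting that the determinant $|A_{ij}|$ reduces to
$(A - B)^{k-1}(A + (k-1)B)$, thus obtaining the following specialized form
of (2.2):

$$P' = \frac{[(A-B)^{k-1}(A+(k-1)B)]^{\frac{1}{2}n}}{(2\pi)^{\frac{1}{2}nk}}\exp\left\{-\tfrac{1}{2}\left[A\sum_{\alpha=1}^{n}\sum_{i=1}^{k}(x_{i\alpha}-a)^2 + B\sum_{\alpha=1}^{n}\sum_{i\neq j=1}^{k}(x_{i\alpha}-a)(x_{j\alpha}-a)\right]\right\} \tag{2.12}$$

The values of $a$, $A$ and $B$ which maximize $P'$ are given by solving the following
three equations:

$$\frac{\partial P'}{\partial a} = 0, \qquad \frac{\partial P'}{\partial A} = 0, \qquad \frac{\partial P'}{\partial B} = 0. \tag{2.13}$$

These equations are respectively

$$\begin{aligned}
&\left[A\sum_{\alpha=1}^{n}\sum_{i=1}^{k}(x_{i\alpha}-a) + (k-1)B\sum_{\alpha=1}^{n}\sum_{i=1}^{k}(x_{i\alpha}-a)\right]P' = 0,\\
&\left[\frac{n}{2}\left(\frac{k-1}{A-B} + \frac{1}{A+(k-1)B}\right) - \frac{1}{2}\sum_{\alpha=1}^{n}\sum_{i=1}^{k}(x_{i\alpha}-a)^2\right]P' = 0,\\
&\left[\frac{n(k-1)}{2}\left(\frac{1}{A+(k-1)B} - \frac{1}{A-B}\right) - \frac{1}{2}\sum_{\alpha=1}^{n}\sum_{i\neq j=1}^{k}(x_{i\alpha}-a)(x_{j\alpha}-a)\right]P' = 0.
\end{aligned} \tag{2.14}$$

Replacing $a$ by $\bar{x}$ in (2.14), putting $\frac{1}{n}\sum_{\alpha=1}^{n}(x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j) = s_{ij}$, and
setting

$$\begin{aligned}
\bar{x} &= \frac{1}{nk}\sum_{\alpha=1}^{n}\sum_{i=1}^{k}x_{i\alpha},\\
s_{0ij} &= \frac{1}{n}\sum_{\alpha=1}^{n}(x_{i\alpha}-\bar{x})(x_{j\alpha}-\bar{x}) = s_{ij} + (\bar{x}_i-\bar{x})(\bar{x}_j-\bar{x}),\\
r_0 &= \frac{\sum_{i\neq j=1}^{k}s_{0ij}}{(k-1)\sum_{i=1}^{k}s_{0ii}} = \frac{\sum_{i\neq j=1}^{k}s_{ij} - \sum_{i=1}^{k}(\bar{x}_i-\bar{x})^2}{(k-1)\left[\sum_{i=1}^{k}s_{ii} + \sum_{i=1}^{k}(\bar{x}_i-\bar{x})^2\right]},\\
s_0^2 &= \frac{1}{k}\sum_{i=1}^{k}s_{0ii} = \frac{1}{k}\left[\sum_{i=1}^{k}s_{ii} + \sum_{i=1}^{k}(\bar{x}_i-\bar{x})^2\right],
\end{aligned} \tag{2.15}$$

we obtain as solutions of (2.14)

$$a = \bar{x}, \qquad A = \frac{1 + (k-2)r_0}{s_0^2(1-r_0)(1+(k-1)r_0)}, \qquad B = \frac{-r_0}{s_0^2(1-r_0)(1+(k-1)r_0)}.$$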

Substituting these in (2.12) we obtain

$$P_\omega = \frac{e^{-\frac{1}{2}kn}}{[(s_0^2)^k(1-r_0)^{k-1}(1+(k-1)r_0)]^{\frac{1}{2}n}(2\pi)^{\frac{1}{2}kn}}. \tag{2.17}$$

The likelihood ratio $\lambda_{mvc}$ for testing hypothesis $H_{mvc}$ is given by

$$\lambda_{mvc} = \frac{P_\omega}{P_\Omega} = \left[\frac{|s_{ij}|}{(s_0^2)^k(1-r_0)^{k-1}(1+(k-1)r_0)}\right]^{\frac{1}{2}n}.$$

It will be convenient to use the $\frac{2}{n}$-th root of $\lambda_{mvc}$ (i.e. $\lambda_{mvc}^{2/n}$) as the test
criterion for $H_{mvc}$. Denoting this criterion by $L_{mvc}$, we have

$$L_{mvc} = \frac{|s_{ij}|}{(s_0^2)^k(1-r_0)^{k-1}(1+(k-1)r_0)}. \tag{2.18}$$

The use of $L_{mvc}$ as a test criterion is obviously equivalent to the use of $\lambda_{mvc}$.

It will be seen that $L_{mvc}$ is equal to unity when and only when the sample
means $\bar{x}_i$ are all equal, the sample variances $s_{ii}$ are all equal and the
sample covariances $s_{ij}$, ($i \neq j$), are all equal. The greater the departure of
sample means from equality, sample variances from equality and sample covariances
from equality, the smaller will be the value of $L_{mvc}$, its value, of course,
always remaining between 0 and 1.

2.1.2. Approximate distribution of $-n \ln L_{mvc}$ in large samples. In order to
make use of $L_{mvc}$ as a criterion for testing hypothesis $H_{mvc}$ we must find its
sampling distribution under the assumption that $H_{mvc}$ is true, i.e. that our sample
has, in fact, been drawn from a $k$-variate normal population having equal means,
equal variances and equal covariances. In the case of large samples, it follows
from a theorem on asymptotic distributions of likelihood ratios [5] that $-2 \ln \lambda_{mvc}$
(i.e. $-n \ln L_{mvc}$) is approximately distributed according to the chi-square law
with $\tfrac{1}{2}k(k+3) - 3$ degrees of freedom (obtained by taking the difference between
the number of parameters used in maximizing $P$ to obtain $P_\Omega$ and that
used in maximizing $P'$ to obtain $P_\omega$).

Thus, to apply the test, one computes the value of $-n \ln L_{mvc}$ for the given
sample, and sees whether the obtained value is significant at the given probability
level (5% or 1%) using the chi-square table for $\tfrac{1}{2}k(k+3) - 3$ degrees of freedom.

To make a study of how closely the chi-square distribution approximates the
exact distribution of $-n \ln L_{mvc}$ for various values of $k$ and $n$ would be an arduous
task in computation. But existing experience with approximations to large
sample distributions indicates that the approximation in the present problem
would be satisfactory for small values of $k$ (say not more than 5) and values
of $n$ not less than 50. Some light is thrown on this question for $k = 2$ and 3
by Table IV.
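The parameter-counting rule for the degrees of freedom can be written out explicitly. The sketch below is only an illustration of that rule: the unrestricted population has $k$ means and $\tfrac{1}{2}k(k+1)$ variances and covariances, $H_{mvc}$ leaves the three parameters $a$, $\sigma^2$, $\rho$, and $H_{vc}$ leaves $k$ means together with $\sigma^2$ and $\rho$. The function name is hypothetical.

```python
def degrees_of_freedom(k):
    """Chi-square degrees of freedom for the three large-sample tests."""
    full = k + k * (k + 1) // 2   # k means plus k(k+1)/2 variances and covariances
    under_mvc = 3                 # a, sigma^2, rho
    under_vc = k + 2              # k means, sigma^2, rho
    return {
        "mvc": full - under_mvc,      # k(k+3)/2 - 3
        "vc": full - under_vc,        # k(k+1)/2 - 2
        "m": under_vc - under_mvc,    # k - 1
    }


for k in range(2, 7):
    print(k, degrees_of_freedom(k))
```

For $k = 2, \ldots, 6$ this reproduces the d.f. columns of Table III.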

2.1.3. Moments of the exact distribution of $L_{mvc}$. In Section 2.1.2 an approximation
is given to the distribution of $-n \ln L_{mvc}$ for large samples. As a matter
of fact, one can find expressions for the moments of the exact distribution of
$L_{mvc}$, which for the cases of $k = 2$ and $k = 3$ yield simple expressions for the
exact distribution of $L_{mvc}$.

To find the moments of $L_{mvc}$ it will be noted that if one sets

$$ns_{ij} = a_{ij}, \qquad ns_{0ij} = a_{0ij} = a_{ij} + n(\bar{x}_i - \bar{x})(\bar{x}_j - \bar{x})$$

in expression (2.18) for $L_{mvc}$, the following expression is obtained for $L_{mvc}$:

$$L_{mvc} = \frac{|a_{ij}|}{R_0^{\,k-1}S_0} \tag{2.19}$$

where

$$R_0 = \frac{1}{k}\sum_{i=1}^{k}a_{0ii} - \frac{1}{k(k-1)}\sum_{i\neq j=1}^{k}a_{0ij}, \qquad S_0 = \frac{1}{k}\left[\sum_{i=1}^{k}a_{0ii} + \sum_{i\neq j=1}^{k}a_{0ij}\right]. \tag{2.20}$$

It will be seen that $L_{mvc}$ depends on the $\bar{x}_i$ and the $a_{ij}$. In the case of a sample
from a general normal multivariate population, we know the $a_{ij}$ to be distributed
according to the Wishart [8] distribution function

$$W_{n-1,k}(a_{ij}; A_{ij}) = \frac{|A_{ij}|^{\frac{1}{2}(n-1)}\,|a_{ij}|^{\frac{1}{2}(n-k-2)}\exp\left[-\tfrac{1}{2}\sum_{i,j=1}^{k}A_{ij}a_{ij}\right]}{2^{\frac{1}{2}k(n-1)}\,\pi^{\frac{1}{4}k(k-1)}\prod_{i=1}^{k}\Gamma(\tfrac{1}{2}(n-i))} \tag{2.21}$$

and the means $\bar{x}_i$ to be independently distributed according to the normal
distribution

$$\frac{n^{\frac{1}{2}k}|A_{ij}|^{\frac{1}{2}}}{(2\pi)^{\frac{1}{2}k}}\exp\left[-\frac{n}{2}\sum_{i,j=1}^{k}A_{ij}(\bar{x}_i - a_i)(\bar{x}_j - a_j)\right] \tag{2.22}$$

where the $A_{ij}$ and $a_i$ were defined in (2.1).

We now define a function $\varphi(g, u, v)$ as the mean value of $|a_{ij}|^g e^{uR_0 + vS_0}$ when
$H_{mvc}$ is true, i.e.,

$$\varphi(g, u, v) = E\left(|a_{ij}|^g e^{uR_0 + vS_0}\right) \tag{2.23}$$

where the right hand side denotes multiplication of (2.21) by (2.22) (after imposing
condition (2.4)) by $|a_{ij}|^g e^{uR_0 + vS_0}$ and then integration with respect to
the $a_{ij}$ and $\bar{x}_i$. This yields

$$\varphi(g, u, v) = 2^{gk}\prod_{i=1}^{k}\left[\frac{\Gamma(\tfrac{1}{2}(n-i)+g)}{\Gamma(\tfrac{1}{2}(n-i))}\right]\frac{(A-B)^{\frac{1}{2}n(k-1)}(A+(k-1)B)^{\frac{1}{2}(n-1)}}{\left[A-B-\frac{2u}{k-1}\right]^{\frac{1}{2}(k-1)(n+2g)}\left[A+(k-1)B-2v\right]^{\frac{1}{2}(n-1)+g}} \tag{2.24}$$

Now the $g$-th moment $M_g(L_{mvc})$ of $L_{mvc}$ is defined by

$$M_g(L_{mvc}) = E[(L_{mvc})^g] \tag{2.25}$$

and is obtained by evaluating the partial derivative

$$\frac{\partial^{\,r(k-1)+s}\varphi}{\partial u^{r(k-1)}\,\partial v^{s}} \tag{2.26}$$

at $u = v = 0$, and then putting $r = -g$ and $s = -g$. The validity of this
operation for the range of values of $g$ in which we are interested can be established
by an argument based on analytic continuation. Alternatively, the same
result can be achieved by taking the indefinite integral of $\varphi$ $g(k-1)$ times successively
with respect to $u$, and $g$ times successively with respect to $v$ (the lower
limit of integration being $-\infty$ in every case) and then evaluating the final
result at $u = v = 0$. Accordingly, we obtain for the $g$-th moment of $L_{mvc}$,
when hypothesis $H_{mvc}$ is true, the following expression:

$$M_g(L_{mvc}) = \prod_{i=1}^{k}\left[\frac{\Gamma(\tfrac{1}{2}(n-i)+g)}{\Gamma(\tfrac{1}{2}(n-i))}\right](k-1)^{g(k-1)}\frac{\Gamma(\tfrac{1}{2}(n-1))\,\Gamma(\tfrac{1}{2}n(k-1))}{\Gamma(\tfrac{1}{2}(n-1)+g)\,\Gamma(\tfrac{1}{2}n(k-1)+g(k-1))}. \tag{2.27}$$

2.1.4. Distribution of $L_{mvc}$ for $k = 2$ and 3. For $k = 2$, the criterion $L_{mvc}$
can be expressed as

$$L_{mvc} = \frac{\begin{vmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{vmatrix}}{\begin{vmatrix} \tfrac{1}{2}(s_{11}+s_{22}) + \tfrac{1}{4}(\bar{x}_1-\bar{x}_2)^2 & s_{12} - \tfrac{1}{4}(\bar{x}_1-\bar{x}_2)^2 \\ s_{12} - \tfrac{1}{4}(\bar{x}_1-\bar{x}_2)^2 & \tfrac{1}{2}(s_{11}+s_{22}) + \tfrac{1}{4}(\bar{x}_1-\bar{x}_2)^2 \end{vmatrix}} \tag{2.28}$$

The $g$-th moment of $L_{mvc}$ for $k = 2$ (obtained by putting $k = 2$ in (2.27)) is

$$M_g(L_{mvc}) = \frac{\Gamma(\tfrac{1}{2}n)\,\Gamma(\tfrac{1}{2}(n-2)+g)}{\Gamma(\tfrac{1}{2}n+g)\,\Gamma(\tfrac{1}{2}(n-2))} = \frac{\tfrac{1}{2}(n-2)}{\tfrac{1}{2}(n-2)+g}, \tag{2.29}$$

and the distribution function of $L_{mvc}$ is found to be

$$dF(L_{mvc}) = \tfrac{1}{2}(n-2)\,L_{mvc}^{\frac{1}{2}(n-4)}\,dL_{mvc}, \qquad (0 < L_{mvc} < 1). \tag{2.30}$$

For $k = 3$, $L_{mvc}$ can be written as

$$L_{mvc} = \frac{\begin{vmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{vmatrix}}{(s_0^2)^3(1-r_0)^2(1+2r_0)} \tag{2.31}$$

where $s_0^2$ and $r_0$ are defined in (2.15) for $k = 3$. Putting $k = 3$ in (2.27) we
find the $g$-th moment of $L_{mvc}$ for this case to be

$$M_g(L_{mvc}) = 2^{2g}\,\frac{\Gamma(\tfrac{1}{2}(n-2)+g)\,\Gamma(\tfrac{1}{2}(n-3)+g)\,\Gamma(n)}{\Gamma(\tfrac{1}{2}(n-2))\,\Gamma(\tfrac{1}{2}(n-3))\,\Gamma(n+2g)}. \tag{2.32}$$

By using the fact that

$$\Gamma(x+\tfrac{1}{2})\,\Gamma(x+1) = \frac{\sqrt{\pi}\,\Gamma(2x+1)}{2^{2x}},$$

it is seen that $M_g(L_{mvc})$ reduces to

$$M_g(L_{mvc}) = \frac{\Gamma(n)\,\Gamma(n-3+2g)}{\Gamma(n+2g)\,\Gamma(n-3)}, \tag{2.33}$$

from which we deduce the distribution of $L_{mvc}$ to be

$$dF(L_{mvc}) = \frac{\Gamma(n)}{\Gamma(3)\,\Gamma(n-3)}\left(\sqrt{L_{mvc}}\right)^{n-4}\left(1-\sqrt{L_{mvc}}\right)^{2}d\sqrt{L_{mvc}}, \qquad (0 < L_{mvc} < 1). \tag{2.34}$$

For values of $k > 3$, the exact distribution of $L_{mvc}$ seems to be too complicated
to lend itself to ready computation.

Thus, relatively simple exact tests of significance of $L_{mvc}$ can be set up for
$k = 2$ and $k = 3$ by using distribution functions (2.30) and (2.34) respectively.
For large values of $n$ we have pointed out that the significance of $L_{mvc}$ can be
tested by making use of the fact that $-n \ln L_{mvc}$ is approximately distributed
according to a chi-square law with $\tfrac{1}{2}k(k+3) - 3$ degrees of freedom when $H_{mvc}$
is true.

For $k = 2$, $L_{mvc}$ is essentially a criterion for simultaneously testing, on the
basis of a sample, the hypothesis of equality of means and equality of variances
of a normal bivariate population.

It should be noted that if $H_{mvc}$ is true, or more realistically, is supported by
the sample as a result of applying test $L_{mvc}$, then population $\Pi$ is characterized
by the three parameters $a$, $\sigma^2$ and $\rho$ in (2.3). The likelihood estimates of these
parameters are $\bar{x}$, $s_0^2$ and $r_0$.

2.2. The test $L_{vc}$ for $H_{vc}$, the hypothesis of equality of variances and equality
of covariances, irrespective of the values of the means.

2.2.1. Derivation of the criterion $L_{vc}$. If, in testing hypothesis $H_{mvc}$ by means
of the criterion $L_{mvc}$, at a given level of significance, say $\epsilon$, a non-significant value
of $L_{mvc}$ is obtained, one states that the sample is consistent with the hypothesis
$H_{mvc}$ that all the population means are equal, the variances are equal and the
covariances are equal. Consideration of the Neyman-Pearson Type II error
involved in this statement would be very arduous and involved and will not be
attempted. On the other hand, if a significant value of $L_{mvc}$ is obtained, one
states that the sample contradicts the hypothesis $H_{mvc}$ with probability $\epsilon$ of
making a Neyman-Pearson Type I error. In this case it may be reasonable to
inquire whether the sample would support the hypothesis if the variability
due to the means were eliminated. In other words, we may inquire whether the
sample supports the hypothesis $H_{vc}$ of equal variances and equal covariances,
irrespective of what values the population means may have. To obtain the
likelihood ratio criterion $L_{vc}$ for testing $H_{vc}$ we maximize the likelihood (2.2)
under the following two sets of conditions: first, with respect to the means $a_i$
and the variances and covariances $\rho_{ij}\sigma_i\sigma_j$; and secondly, with respect to the
means $a_i$ and $A$ and $B$, where $A$ and $B$ are obtained by imposing the condition
on the matrix $\|A_{ij}\|$ specified in (2.4). The maximum of (2.2) under the first
condition is given by (2.11). Denoting the maximum of (2.2) under the second
set of conditions by $P_{\omega'}$, it is found, by a procedure similar to that used in finding
$P_\omega$ (given by (2.17)), that $P_{\omega'}$ is given by

$$P_{\omega'} = \frac{e^{-\frac{1}{2}kn}}{[(s^2)^k(1-r)^{k-1}(1+(k-1)r)]^{\frac{1}{2}n}(2\pi)^{\frac{1}{2}kn}} \tag{2.35}$$

where

$$r = \frac{\sum_{i\neq j=1}^{k}s_{ij}}{(k-1)\sum_{i=1}^{k}s_{ii}}, \qquad s^2 = \frac{1}{k}\sum_{i=1}^{k}s_{ii}. \tag{2.36}$$

The likelihood ratio $\lambda_{vc}$ for testing $H_{vc}$ is given by

$$\lambda_{vc} = \frac{P_{\omega'}}{P_\Omega} = \left[\frac{|s_{ij}|}{(s^2)^k(1-r)^{k-1}(1+(k-1)r)}\right]^{\frac{1}{2}n}.$$

The test criterion which will be used for testing $H_{vc}$ is $L_{vc}$, the $\frac{2}{n}$-th root of
$\lambda_{vc}$, i.e.,

$$L_{vc} = \frac{|s_{ij}|}{(s^2)^k(1-r)^{k-1}(1+(k-1)r)}. \tag{2.37}$$


2.2.2. Approximate distribution of $-n \ln L_{vc}$ in large samples.

In the case of large samples $-n \ln L_{vc}$ is approximately distributed according
to the chi-square law with $\tfrac{1}{2}k(k+1) - 2$ degrees of freedom when hypothesis $H_{vc}$
is true.

2.2.3. Moments of the exact distribution of $L_{vc}$. The moments of $L_{vc}$ when
$H_{vc}$ is true can be found by a method similar to that used in Section 2.1.3 for
determining the moments of $L_{mvc}$. For it will be noted that $L_{vc}$ can be written as

$$L_{vc} = \frac{|a_{ij}|}{R^{\,k-1}S} \tag{2.38}$$

where

$$R = \frac{1}{k}\sum_{i=1}^{k}a_{ii} - \frac{1}{k(k-1)}\sum_{i\neq j=1}^{k}a_{ij}, \qquad S = \frac{1}{k}\left[\sum_{i=1}^{k}a_{ii} + \sum_{i\neq j=1}^{k}a_{ij}\right], \tag{2.39}$$

from which it is evident that $L_{vc}$ depends only on the $a_{ij}$, whose distribution in
the case of a general normal multivariate population is given by (2.21). We
now define a function $\theta(g, y, z)$ as the mean value of $|a_{ij}|^g e^{yR + zS}$ under the
assumption that $H_{vc}$ is true, i.e.,

$$\theta(g, y, z) = E\left(|a_{ij}|^g e^{yR + zS}\right) \tag{2.40}$$

where the value of the right hand side is obtained by multiplying (2.21) by
$|a_{ij}|^g e^{yR + zS}$, then imposing the condition on $\|A_{ij}\|$ stated in (2.4) and integrating
with respect to the $a_{ij}$. Accordingly, we find

$$\theta(g, y, z) = 2^{gk}\prod_{i=1}^{k}\left[\frac{\Gamma(\tfrac{1}{2}(n-i)+g)}{\Gamma(\tfrac{1}{2}(n-i))}\right]\frac{(A-B)^{\frac{1}{2}(n-1)(k-1)}(A+(k-1)B)^{\frac{1}{2}(n-1)}}{\left[A-B-\frac{2y}{k-1}\right]^{(k-1)(\frac{1}{2}(n-1)+g)}\left[A+(k-1)B-2z\right]^{\frac{1}{2}(n-1)+g}} \tag{2.41}$$

The $g$-th moment $M_g(L_{vc})$ of $L_{vc}$ is obtained by evaluating the partial derivative

$$\frac{\partial^{\,r(k-1)+s}\theta}{\partial y^{r(k-1)}\,\partial z^{s}} \tag{2.42}$$

at $y = z = 0$, and then setting $r = -g$ and $s = -g$. These operations yield


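$$M_g(L_{vc}) = \prod_{i=1}^{k}\left[\frac{\Gamma(\tfrac{1}{2}(n-i)+g)}{\Gamma(\tfrac{1}{2}(n-i))}\right](k-1)^{g(k-1)}\frac{\Gamma(\tfrac{1}{2}(n-1))\,\Gamma(\tfrac{1}{2}(k-1)(n-1))}{\Gamma(\tfrac{1}{2}(n-1)+g)\,\Gamma(\tfrac{1}{2}(k-1)(n-1)+g(k-1))}. \tag{2.43}$$

2.2.4. Distribution of $L_{vc}$ for $k = 2$ and 3. For $k = 2$, $L_{vc}$ can be expressed
as follows:

$$L_{vc} = \frac{\begin{vmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{vmatrix}}{\begin{vmatrix} \tfrac{1}{2}(s_{11}+s_{22}) & s_{12} \\ s_{12} & \tfrac{1}{2}(s_{11}+s_{22}) \end{vmatrix}} \tag{2.44}$$

and the $g$-th moment of $L_{vc}$ is given by

$$M_g(L_{vc}) = \frac{\Gamma(\tfrac{1}{2}(n-1))\,\Gamma(\tfrac{1}{2}(n-2)+g)}{\Gamma(\tfrac{1}{2}(n-1)+g)\,\Gamma(\tfrac{1}{2}(n-2))} \tag{2.45}$$

from which the distribution of $L_{vc}$ is deduced to be

$$dF(L_{vc}) = \frac{\Gamma(\tfrac{1}{2}(n-1))}{\Gamma(\tfrac{1}{2}(n-2))\,\Gamma(\tfrac{1}{2})}\,L_{vc}^{\frac{1}{2}(n-4)}(1-L_{vc})^{-\frac{1}{2}}\,dL_{vc}, \qquad (0 < L_{vc} < 1). \tag{2.46}$$

For $k = 3$, $L_{vc}$ can be expressed as

$$L_{vc} = \frac{\begin{vmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{vmatrix}}{(s^2)^3(1-r)^2(1+2r)} \tag{2.47}$$

where $s^2$ and $r$ are defined in (2.36) by setting $k = 3$. Setting $k = 3$ in (2.43)
we find as the $g$-th moment of $L_{vc}$

$$M_g(L_{vc}) = 2^{2g}\,\frac{\Gamma(\tfrac{1}{2}(n-2)+g)\,\Gamma(\tfrac{1}{2}(n-3)+g)\,\Gamma(n-1)}{\Gamma(\tfrac{1}{2}(n-2))\,\Gamma(\tfrac{1}{2}(n-3))\,\Gamma(n-1+2g)}. \tag{2.48}$$

Following the method by which (2.32) was reduced to (2.33), we find that the $g$-th
moment of $L_{vc}$ reduces to

$$M_g(L_{vc}) = \frac{\Gamma(n-1)\,\Gamma(n-3+2g)}{\Gamma(n-1+2g)\,\Gamma(n-3)}, \tag{2.49}$$

and hence the distribution function of $L_{vc}$ for $k = 3$ is

$$dF(L_{vc}) = \frac{\Gamma(n-1)}{\Gamma(n-3)}\left(\sqrt{L_{vc}}\right)^{n-4}\left(1-\sqrt{L_{vc}}\right)d\sqrt{L_{vc}}, \qquad (0 < L_{vc} < 1). \tag{2.50}$$

For higher values of $k$ the distribution of $L_{vc}$ is apparently too complicated for
ready computation. But distributions (2.46) and (2.50) provide relatively
simple significance tests for the cases $k = 2$ and 3, respectively. For large samples
we remark again that a significance test for $L_{vc}$ is provided by the fact
that $-2 \ln \lambda_{vc}$ (i.e., $-n \ln L_{vc}$) is approximately distributed according to the
chi-square law with $\tfrac{1}{2}k(k+1) - 2$ degrees of freedom when $H_{vc}$ is true.

For $k = 2$, $L_{vc}$ is essentially a criterion for testing, on the basis of a sample,
the hypothesis of equality of variances of a normal bivariate population.

If $H_{vc}$ is true, $\Pi$ will be characterized by the parameters $a_1, a_2, \ldots, a_k$, $\sigma^2$
and $\rho$. The maximum likelihood estimates of these parameters are $\bar{x}_1, \bar{x}_2, \ldots,
\bar{x}_k$, $s^2$ and $r$, respectively.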

2.3. The test $L_m$ for $H_m$, the hypothesis of equality of means, when the
variances are equal and covariances are equal.

2.3.1. Derivation of the criterion $L_m$. Suppose $L_{vc}$, described in Section 2.2.1
for testing $H_{vc}$, the hypothesis of equal variances and equal covariances, does
not have a significantly small value, thus indicating that the sample does not
contradict the hypothesis $H_{vc}$. Then, assuming that the original test $L_{mvc}$
of $H_{mvc}$ turned out to have a significantly small value, we may inquire as to
whether the significance of $L_{mvc}$ is due to the inequality of the population means
$a_i$. In this section we shall consider a criterion $L_m$ for testing the hypothesis
$H_m$ that the means $a_i$ are equal, assuming that the variances are equal and that
the covariances are equal. For this hypothesis we maximize the likelihood (2.2)
under the following two sets of conditions: first, with respect to the $a_i$, $A$ and $B$,
where $A$ and $B$ are defined by the condition on $\|A_{ij}\|$ given in (2.4); secondly,
with respect to $a$, $A$ and $B$ where these parameters are specified by (2.4). The
maxima of the likelihood (2.2) under these two conditions are $P_{\omega'}$ and $P_\omega$,
given by (2.35) and (2.17) respectively. The likelihood ratio $\lambda_m$ is therefore

$$\lambda_m = \frac{P_\omega}{P_{\omega'}} = \left[\frac{(s^2)^k(1-r)^{k-1}(1+(k-1)r)}{(s_0^2)^k(1-r_0)^{k-1}(1+(k-1)r_0)}\right]^{\frac{1}{2}n}. \tag{2.51}$$

Now it follows from the definitions of $s^2$, $r$, $s_0^2$ and $r_0$, (2.15) and (2.36), that

$$s^2(1+(k-1)r) = s_0^2(1+(k-1)r_0)$$

and hence we may write

$$\lambda_m^{2/n} = \left[\frac{s^2(1-r)}{s_0^2(1-r_0)}\right]^{k-1}. \tag{2.52}$$

We can also express $\lambda_m^{2/n}$ as

$$\lambda_m^{2/n} = \left(\frac{R}{R_0}\right)^{k-1} \tag{2.53}$$

where $R_0$ and $R$ are defined by (2.20) and (2.39) respectively.

It will be most convenient for our purposes to use $L_m$, the $[2/n(k-1)]$-th
root of $\lambda_m$ (i.e. $L_m = \lambda_m^{2/[n(k-1)]}$), as the criterion for testing $H_m$, i.e.

$$L_m = \frac{R}{R_0} = \frac{s^2(1-r)}{s_0^2(1-r_0)} = \frac{s^2(1-r)}{s^2(1-r) + \frac{1}{k-1}\sum_{i=1}^{k}(\bar{x}_i-\bar{x})^2}. \tag{2.54}$$

2.3.2. Approximate distribution of $-n(k-1) \ln L_m$ in large samples.

In large samples $-2 \ln \lambda_m$ (i.e., $-n(k-1) \ln L_m$) is approximately distributed
according to the chi-square law with $k - 1$ degrees of freedom. However,
the exact distribution of $L_m$ is relatively simple and will be derived.

2.3.3. Exact distribution of $L_m$ when $H_m$ is true. We shall determine the distribution
of $L_m$ by first finding the $g$-th moment of $L_m$ when $H_m$ is true. For this
purpose we set up the function

$$\Psi(p, q) = E\left(e^{pR + qR_0}\right) \tag{2.55}$$

where the mean value is taken when $H_m$ is true, i.e., when the $a_i$ and $\|A_{ij}\|$
satisfy conditions (2.4). Now $R$ and $R_0$ are functions of the $a_{ij}$ and $\bar{x}_i$.
Hence, to find $E(e^{pR + qR_0})$ we multiply (2.21) by (2.22) by $e^{pR + qR_0}$ and impose
conditions (2.4), then take the integral over the entire space of the $a_{ij}$ and $\bar{x}_i$.
These operations yield

$$\Psi(p, q) = \frac{(A-B)^{\frac{1}{2}n(k-1)}}{\left[A-B-\frac{2(p+q)}{k-1}\right]^{\frac{1}{2}(n-1)(k-1)}\left[A-B-\frac{2q}{k-1}\right]^{\frac{1}{2}(k-1)}}. \tag{2.56}$$

The $g$-th moment of $L_m$ is obtained by performing the following differentiations

$$\left[\frac{\partial^{\,g+h}\Psi}{\partial p^{g}\,\partial q^{h}}\right]_{p=q=0} \tag{2.57}$$

and then putting $h = -g$. These operations yield

$$M_g(L_m) = \frac{\Gamma(\tfrac{1}{2}(n-1)(k-1)+g)\,\Gamma(\tfrac{1}{2}n(k-1))}{\Gamma(\tfrac{1}{2}(n-1)(k-1))\,\Gamma(\tfrac{1}{2}n(k-1)+g)} \tag{2.58}$$

from which the distribution of $L_m$ (when $H_m$ is true) is found to be

$$dF(L_m) = \frac{\Gamma(\tfrac{1}{2}n(k-1))}{\Gamma(\tfrac{1}{2}(n-1)(k-1))\,\Gamma(\tfrac{1}{2}(k-1))}\,L_m^{\frac{1}{2}(n-1)(k-1)-1}(1-L_m)^{\frac{1}{2}(k-1)-1}\,dL_m, \qquad (0 < L_m < 1). \tag{2.59}$$

Thus, we are able to make an exact test of significance of $L_m$ on the basis of
the function (2.59).

2.4. Relations between $L_{mvc}$, $L_{vc}$ and $L_m$.

It will be seen from the definitions of $L_{mvc}$, $L_{vc}$ and $L_m$ in (2.18), (2.37) and
(2.54) (noting that $s^2(1 + (k-1)r) = s_0^2(1 + (k-1)r_0)$) that

$$L_{mvc} = L_{vc}\,L_m^{\,k-1}.$$

Furthermore, it will be noted that when $H_{mvc}$ is true, the $g$-th moment of $L_{mvc}$
given by (2.27) is equal to the product of the $g$-th moment of $L_{vc}$ given by (2.43)
and the $g$-th moment of $L_m^{\,k-1}$ (obtained by replacing $g$ by $g(k-1)$ in (2.58)).
Thus, when $H_{mvc}$ is true $L_{mvc}$ is composed of the product of two independently
distributed quantities, namely $L_{vc}$ and $L_m^{\,k-1}$.
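The factorization of the moments can be confirmed numerically. The sketch below evaluates the three moment formulas (2.27), (2.43) and (2.58) through `math.lgamma` and compares the $g$-th moment of $L_{mvc}$ with the product of the $g$-th moment of $L_{vc}$ and the $g(k-1)$-th moment of $L_m$. The function names are made up, and the formulas assume $k \geq 2$ and $n > k$.

```python
import math


def _log_prod(g, n, k):
    """Log of the product over i = 1..k of Gamma((n-i)/2 + g) / Gamma((n-i)/2)."""
    return sum(math.lgamma((n - i) / 2 + g) - math.lgamma((n - i) / 2)
               for i in range(1, k + 1))


def moment_mvc(g, n, k):
    """g-th moment (2.27) of L_mvc under H_mvc."""
    m = n * (k - 1) / 2
    return math.exp(_log_prod(g, n, k) + g * (k - 1) * math.log(k - 1)
                    + math.lgamma((n - 1) / 2) + math.lgamma(m)
                    - math.lgamma((n - 1) / 2 + g) - math.lgamma(m + g * (k - 1)))


def moment_vc(g, n, k):
    """g-th moment (2.43) of L_vc under H_vc."""
    m = (n - 1) * (k - 1) / 2
    return math.exp(_log_prod(g, n, k) + g * (k - 1) * math.log(k - 1)
                    + math.lgamma((n - 1) / 2) + math.lgamma(m)
                    - math.lgamma((n - 1) / 2 + g) - math.lgamma(m + g * (k - 1)))


def moment_m(g, n, k):
    """g-th moment (2.58) of L_m under H_m."""
    m = (n - 1) * (k - 1) / 2
    return math.exp(math.lgamma(m + g) + math.lgamma(n * (k - 1) / 2)
                    - math.lgamma(m) - math.lgamma(n * (k - 1) / 2 + g))


g, n, k = 1.5, 25, 4
print(moment_mvc(g, n, k), moment_vc(g, n, k) * moment_m(g * (k - 1), n, k))
```

The two printed numbers should agree to rounding error for any admissible $g$, $n$ and $k$.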

REFERENCES

[1] H. Scheffé, “A Note on the Behrens-Fisher Problem,” Annals of Math. Stat., Vol. 15
    (1944), pp. 430-432.
[2] John W. Tukey and S. S. Wilks, “Approximation of the distribution of the product of
    beta variables by a single beta variable,” Annals of Math. Stat., Vol. 17 (1946),
    pp. 318-324.
[3] K. Pearson, Tables of the Incomplete Beta Function, Cambridge University Press, 1932.
[4] Catherine M. Thompson, “Table of percentage points of the Incomplete Beta Function,”
    Biometrika, Vol. 32, Part II (1941), pp. 151-181.
[5] S. S. Wilks, “The large-sample distribution of the likelihood ratio for testing composite
    hypotheses,” Annals of Math. Stat., Vol. 9 (1938), pp. 60-62.
[6] Catherine M. Thompson, “Table of percentage points of the $\chi^2$ distribution,”
    Biometrika, Vol. 32, Part II (1941), pp. 187-191.
[7] John W. Mauchly, “Significance test for sphericity of a normal multivariate distribution,”
    Annals of Math. Stat., Vol. 11 (1940), pp. 204-209.
[8] J. Wishart, “The generalized product moment distribution in samples from a normal
    multivariate population,” Biometrika, Vol. 20A, pp. 32-52.



CONTRIBUTIONS TO THE THEORY OF SEQUENTIAL
ANALYSIS, II, III

By M. A. Girshick

United States Department of Agriculture

Summary. This is a continuation of a paper, Part I of which was published in
the June, 1946 issue of the Annals of Mathematical Statistics. The present paper
is divided into two parts, Parts II and III, which are summarized as follows:

Part II. The Exact Power Curve and the Distribution of n for Sequential Tests
Where z Takes on a Finite Number of Integral Values.

Consider a sequential test defined by a decision function $Z_n = \sum_{\alpha=1}^{n} z_\alpha$ with
boundaries $-b$ and $a$ where $a$ and $b$ are positive integers and $z_\alpha$ is the $\alpha$-th
observation of a variate $z$ which takes on a finite number of integral values ranging
from the negative integer $-r$ to the positive integer $m$ with respective probabilities
$p_{-r}, \ldots, p_m$. Let $\xi_{aj} = P[Z_n = (a + j)]$, ($j = 0, 1, 2, \ldots, m - 1$), and $\xi_{bj} =
P[Z_n = -(b + j)]$, ($j = 0, 1, 2, \ldots, r - 1$). Furthermore, let $A$ be a square matrix
of $a + b - 1$ rows and columns with elements defined by: $a_{vv} = 1 - p_0$ for all $v$,
$a_{v,v+k} = -p_k$ for $k = 1, 2, \ldots, m$; $a_{v,v-j} = -p_{-j}$ for $j = 1, 2, \ldots, r$, and $a_{ij} = 0$
otherwise.

It is proved that 

(0 ~ ^2 Pi-rA r -j — , (j ~ 0, 1, ■ , r 1) 

1»0 

m — 7— 1 

(^) = Pi-4 j+l-^a+fr— f— 1,6 ) (j = If -Oi 

i-O 

where Am is the element of the Ath row and bth column in A~ l . Let E a ,r" 
and Eb,r n be the conditional generatiftg function of n under the restriction that 
Z n = (o + j) and Z„ = —(6 + 7) respectively. Then ^b,E b ,r n is obtained by 
. substituting rp, for each p, occurring in equation (i) and is obtained by 

substituting rp, for each pj occurring in equation (ii). The probability that 
Z n — a + j in exactly n steps is given by the coefficient of r 71 in the expansion of 
% a jE aj r n in a power series in r. The probability that Z n — — (b -f j) in exactly 
n steps is similarly obtained. 

This method is applied to the derivation of the exact power function and the 
distribution of n for the sequential binomial probability ratio test, 

Part III. On Conjugate Distributions.

Consider a random variable $X$ with a distribution density $f(x, \theta)$ which satisfies certain specified conditions. Let $\theta_1$ and $\theta_2$ be two values of $\theta$ and let $z = \log [f(x, \theta_2)/f(x, \theta_1)]$. For any hypothesis $\theta = \theta'$, let $\phi(t \mid \theta')$ be the moment generating function of $z$ and $h$ the non-zero value of $t$ for which $\phi(t \mid \theta') = 1$. We set $F(x) = e^{hz} f(x, \theta')$. Then $f$ and $F$ are conjugate distributions. If $F = f(x, \theta'')$, then $\theta'$ and $\theta''$ are defined as conjugate pairs.

A method is given for obtaining the totality of conjugate pairs for the general class of distributions which admit a sufficient statistic. It is then shown that the power of the sequential probability ratio test based on such distributions is given explicitly in terms of these pairs. It is proven that within the approximation obtained by neglecting the excess of $|Z_n|$ over $a$ and $b$ at a decision point the following relationship holds:

$P_b(n \mid F) = e^{-hb} P_b(n \mid f)$

$P_a(n \mid F) = e^{ha} P_a(n \mid f)$

where $P_a(n \mid g)$ and $P_b(n \mid g)$ stand for the probability that $Z_n \ge a$ and $Z_n \le -b$ respectively in exactly $n$ steps under the hypothesis $g$.

II. The Exact Power Curve and the Distribution of n for Sequential
Tests Where z Takes on a Finite Number of Integral Values

2.1. General discussions. Let a sequential test be defined by a decision function $Z_n = \sum_{\alpha=1}^{n} z_\alpha$ with boundaries $-b$ and $a$, where $a$ and $b$ are positive and $z_\alpha$ is the $\alpha$th observation of a variate $z$ which takes on a finite number of integral values, $-r, -r + 1, \ldots, -1, 0, 1, 2, \ldots, m$. Let $P(z = i) = p_i$, where $P(z = i)$ stands for the probability that $z$ takes on the value $i$. We shall assume without any loss of generality that $a$ and $b$ are integers.

When the sequential test terminates with $Z_n \ge a$, the possible values that $Z_n$ can take on are: $a, a + 1, \ldots, a + m - 1$. Similarly, when the sequential test terminates with $Z_n \le -b$, the possible values which $Z_n$ can take on are: $-b, -(b + 1), \ldots, -(b + r - 1)$. Let $\xi_{ai} = P[Z_n = (a + i)]$, $i = 0, 1, \ldots, m - 1$, and $\xi_{bi} = P[Z_n = -(b + i)]$, $i = 0, 1, \ldots, r - 1$.

For any variate $u$, let the symbol $E_{bi}(u)$ stand for the expected value of $u$ under the restriction that $Z_n = -(b + i)$, and the symbol $E_{ai}(u)$ stand for the expected value of $u$ under the restriction that $Z_n = a + i$. Let $\phi(t)$ be the generating function of $z$. Then

(2.101) $\phi(t) = \sum_{i=-r}^{m} p_i t^i.$

In terms of the generating function, the Fundamental Identity (see section 2.32 in [6]) can be written as

(2.102) $\sum_{i=0}^{r-1} \xi_{bi}\, t^{-(b+i)} E_{bi}[\phi(t)]^{-n} + \sum_{i=0}^{m-1} \xi_{ai}\, t^{a+i} E_{ai}[\phi(t)]^{-n} = 1.$

It follows from (2.102) that for all values of $t$ for which

(2.103) $\phi(t) = \sum_{i=-r}^{m} p_i t^i = 1,$

we have

(2.104) $\psi(t) = \sum_{i=0}^{r-1} \xi_{bi}\, t^{-(b+i)} + \sum_{i=0}^{m-1} \xi_{ai}\, t^{a+i} = 1$

where $\psi(t)$ is the generating function of $Z_n$.

In the paper “The cumulative sums of random variables” [2] Wald has given the following method for obtaining the probabilities $\xi_{ai}$ and $\xi_{bi}$. Let $t_1, t_2, \ldots, t_{r+m}$ be the $r + m$ roots of (2.103). Substituting these in (2.104) we get $r + m$ linear equations in the $r + m$ unknowns, $\xi_{ai}$ and $\xi_{bi}$. Thus, if the determinant of these equations is different from zero, the unknowns can be solved in terms of the roots of (2.103). In a similar manner, the characteristic function of $n$ under the restriction that $Z_n = i$ can also be obtained.

The above method has two disadvantages. First, it involves solving for 
all the roots of a polynomial which will often be of a high degree and second, it 
involves solving a set of linear equations with coefficients which are powers of 
complex numbers. 

The method outlined below is in many respects much simpler. It requires only the evaluation of one column of the inverse of a matrix of $a + b - 1$ rows and columns. The elements of the matrix are given explicitly and are either 0, $1 - p_0$, or $-p_i$. This permits obtaining general solutions for special classes of sequential tests.

2.2. Derivation of the exact power functions. We multiply $\phi(t) - 1$ by $t^r$ and $\psi(t) - 1$ by $t^{b+r-1}$ and obtain two polynomials,

(2.201) $f(t) = \sum_{j=0}^{m+r} (p_{j-r} - \delta_{jr})\, t^j$

and

(2.202) $g(t) = \sum_{j=0}^{r-1} \xi_{bj}\, t^{r-j-1} - t^{b+r-1} + \sum_{j=0}^{m-1} \xi_{aj}\, t^{a+b+r+j-1}$

where $\delta_{ik} = 1$ when $i = k$ and zero otherwise.

By the Fundamental Identity, every root of $f(t)$ is also a root of $g(t)$. Since $f(t)$ is of degree $m + r$ and $g(t)$ is of degree $a + b + m + r - 2$, it must follow that $g(t)$ equals $f(t)$ times a polynomial of degree $a + b - 2$.¹ That is,

(2.203) $g(t) = f(t) \sum_{j=0}^{a+b-2} c_j t^j$

where the $c$'s are undetermined constants. Substituting from (2.201) in (2.203) we obtain

(2.204) $g(t) = \sum_{j=0}^{a+b+m+r-2} Q_j t^j$

where

(2.205) $Q_j = \sum_{i=0}^{j} (p_{i-r} - \delta_{ir})\, c_{j-i}.$

¹ It is assumed here that $f(t)$ has no multiple roots. The author conjectures that this is true for the polynomial under consideration for all values of $p$.

Comparing the coefficients of (2.204) with those of (2.202) and taking into account the fact that $p_k = 0$ when $k > m$, and $c_k = 0$ when $k > a + b - 2$, we get

(2.206) $\xi_{bj} = \sum_{i=0}^{r-j-1} p_{i-r}\, c_{r-j-i-1}$, $(j = 0, 1, \ldots, r - 1)$,

and

(2.207) $\xi_{aj} = \sum_{i=0}^{m-j-1} p_{i+j+1}\, c_{a+b-i-2}$, $(j = 0, 1, \ldots, m - 1)$.

Thus, if the $c$'s (we require only the first $r$ and the last $m$) are determined, the probabilities $\xi_{aj}$ and $\xi_{bj}$ are also determined from (2.206) and (2.207). But, if we examine the structure of $g(t)$ in (2.202) we see that the coefficients of all the powers of $t$ from $r$ to $(a + b + r - 2)$ inclusive are zero except for the coefficient of $t^{b+r-1}$ which is equal to $-1$. Consequently, if in (2.204) we set $Q_j = -\delta_{j,\,b+r-1}$ for all $j = r, r + 1, \ldots, a + b + r - 2$, we shall have the required number of equations to solve for the $a + b - 1$ unknown $c$'s.

In view of (2.205) these equations can be written as

(2.208) $\sum_{i=0}^{j} (\delta_{ir} - p_{i-r})\, c_{j-i} = \delta_{j,\,b+r-1}$, $(j = r, \ldots, a + b + r - 2)$.

Changing the range of the subscript $j$, we get

(2.209) $\sum_{i=0}^{j+r-1} (\delta_{ir} - p_{i-r})\, c_{j+r-i-1} = \delta_{jb}$, $(j = 1, 2, \ldots, a + b - 1)$,

with the understanding that $p_k = 0$ when $k > m$ and $c_k = 0$ when $k > a + b - 2$.

Let $A$ be the matrix of the equations in (2.209). Then $A$ is of the following form. The elements in the main diagonal are $(1 - p_0)$. In the diagonals to the right of and parallel to the main diagonal, the elements are $-p_{-1}, -p_{-2}, \ldots, -p_{-r}, 0, \ldots, 0$ successively. In the diagonals to the left of and parallel to the main diagonal, the elements are $-p_1, -p_2, \ldots, -p_m, 0, \ldots, 0$ successively. Assume that the determinant of $A$ is different from zero² and let $A^{-1}$ be the inverse of $A$. Let the elements of $A^{-1}$ be designated by $A^{ij}$, $(i, j = 1, 2, \ldots, a + b - 1)$. Then, in view of (2.209) we get

(2.210) $c_j = A^{j+1,\,b}$, $(j = 0, 1, 2, \ldots, a + b - 2)$.

Finally, from (2.206) and (2.207), we have

(2.211) $\xi_{bj} = \sum_{i=0}^{r-j-1} p_{i-r}\, A^{r-j-i,\,b}$, $(j = 0, 1, 2, \ldots, r - 1)$,

and

(2.212) $\xi_{aj} = \sum_{i=0}^{m-j-1} p_{i+j+1}\, A^{a+b-i-1,\,b}$, $(j = 0, 1, 2, \ldots, m - 1)$,

where, as before, it is understood that $p_k = 0$ when $k > m$ and $A^{kb} = 0$ when $k > a + b - 1$.

² P. L. Hsu has submitted a simple proof to the author that $A$ is non-singular.

From (2.211) and (2.212) we can obtain the probability that $Z_n \le -b$ and the probability that $Z_n \ge a$ since these are given by

$\sum_{j=0}^{r-1} \xi_{bj}$ and $\sum_{j=0}^{m-1} \xi_{aj} \left(= 1 - \sum_{j=0}^{r-1} \xi_{bj}\right)$

respectively. We can also obtain $E(n)$, the average number of steps required to reach a decision. For, if we differentiate (2.102) with respect to $t$ and set $t = 1$, we get

(2.213) $E(n) = \dfrac{\sum_{i=0}^{m-1} \xi_{ai}(a + i) - \sum_{i=0}^{r-1} \xi_{bi}(b + i)}{\sum_{i=-r}^{m} i\, p_i}.$
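The computation described in this section can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the list layout `p[k + r] = P(z = k)`, and the choice of exact rational arithmetic with a hand-written Gauss-Jordan solver are not from the paper. The sketch builds the matrix of (2.209), solves for the $b$th column of its inverse as in (2.210), and then applies (2.206), (2.207) and (2.213).

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[col])]
    return [aug[i][n] for i in range(n)]


def exit_probabilities(p, r, m, a, b):
    """Return (xi_b, xi_a, E(n)); p[k + r] holds P(z = k), an assumed layout."""
    size = a + b - 1

    def prob(k):
        return Fraction(p[k + r]) if -r <= k <= m else Fraction(0)

    # Matrix of (2.209): entry (j, l) equals delta_{jl} - p_{j-l}.
    A = [[(1 if j == l else 0) - prob(j - l) for l in range(size)]
         for j in range(size)]
    unit_b = [1 if j == b - 1 else 0 for j in range(size)]
    c = solve_linear(A, unit_b)  # c[k] = c_k, the b-th column of A^{-1}, (2.210)

    def coef(k):
        return c[k] if 0 <= k < size else Fraction(0)

    xi_b = [sum(prob(i - r) * coef(r - j - i - 1) for i in range(r - j))
            for j in range(r)]                                   # (2.206)
    xi_a = [sum(prob(i + j + 1) * coef(a + b - i - 2) for i in range(m - j))
            for j in range(m)]                                   # (2.207)
    mean_z = sum(k * prob(k) for k in range(-r, m + 1))          # must be non-zero
    expected_n = (sum(x * (a + i) for i, x in enumerate(xi_a))
                  - sum(x * (b + i) for i, x in enumerate(xi_b))) / mean_z  # (2.213)
    return xi_b, xi_a, expected_n


# Simple random walk: z = -1 with probability 3/5 and z = +1 with probability 2/5.
xi_b, xi_a, expected_n = exit_probabilities(
    [Fraction(3, 5), Fraction(0), Fraction(2, 5)], r=1, m=1, a=3, b=2)
print(xi_b, xi_a, expected_n)
```

For this simple walk the results agree with the classical gambler's-ruin formulas, which gives an independent check on the indexing.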

2.3. Derivation of the probability that the sequential test will terminate in exactly n steps. Let $\phi(t)$ be the generating function of $z$ and $\psi(t, \tau)$ the joint generating function of $Z_n$ and $n$. Then

(2.301) $\phi(t) = \sum_{i=-r}^{m} p_i t^i$

and

(2.302) $\psi(t, \tau) = \sum_{i=0}^{r-1} \xi_{bi}\, t^{-(b+i)} E_{bi}\tau^n + \sum_{i=0}^{m-1} \xi_{ai}\, t^{a+i} E_{ai}\tau^n.$

Furthermore, let $\phi_1(t, \tau) = \tau\phi(t) - 1$ and $\psi_1(t, \tau) = \psi(t, \tau) - 1$. In terms of these functions, the Fundamental Identity can be stated as follows: For a fixed $\tau$, every root of $\phi_1(t, \tau)$ is also a root of $\psi_1(t, \tau)$. Let $f(t, \tau) = t^r \phi_1(t, \tau)$ and $g(t, \tau) = t^{b+r-1} \psi_1(t, \tau)$. Then

(2.303) $f(t, \tau) = \sum_{j=0}^{m+r} (\tau p_{j-r} - \delta_{jr})\, t^j$

and

(2.304) $g(t, \tau) = \sum_{j=0}^{r-1} (\xi_{bj} E_{bj}\tau^n)\, t^{r-j-1} - t^{b+r-1} + \sum_{j=0}^{m-1} (\xi_{aj} E_{aj}\tau^n)\, t^{a+b+r+j-1}.$

Since for a fixed $\tau$, every root of $f(t, \tau)$ is a root of $g(t, \tau)$, and since $f(t, \tau)$ is a polynomial in $t$ of degree $m + r$ and $g(t, \tau)$ is a polynomial in $t$ of degree $a + b + m + r - 2$, it must follow that³

(2.305) $g(t, \tau) = f(t, \tau) \sum_{j=0}^{a+b-2} d_j t^j.$

³ See footnote 1, section 2.2.


The rest of the argument is identical with that of section 2.2 except that the unknowns in this case are $\xi_{bj}E_{bj}\tau^n$ and $\xi_{aj}E_{aj}\tau^n$ and are given by

(2.306) $\xi_{bj} E_{bj}\tau^n = \sum_{i=0}^{r-j-1} \tau p_{i-r}\, d_{r-j-i-1}$, $(j = 0, 1, \ldots, r - 1)$,

and

(2.307) $\xi_{aj} E_{aj}\tau^n = \sum_{i=0}^{m-j-1} \tau p_{i+j+1}\, d_{a+b-i-2}$, $(j = 0, 1, \ldots, m - 1)$,

(see (2.206) and (2.207)) where the $d$'s are obtained by solving the linear equations:

(2.308) $\sum_{i=0}^{j+r-1} (\delta_{ir} - \tau p_{i-r})\, d_{j+r-i-1} = \delta_{jb}$, $(j = 1, 2, \ldots, a + b - 1)$,

(see (2.209)). Thus, we see that the solution for $\xi_{bj}E_{bj}\tau^n$ is obtainable from the solution given in 2.2 for $\xi_{bj}$ by substituting $\tau p_i$ for every $p_i$ appearing in the expression (2.211). Similarly, the solution for $\xi_{aj}E_{aj}\tau^n$ is obtainable from the solution given for $\xi_{aj}$ by substituting $\tau p_i$ for every $p_i$ appearing in the expression (2.212).

Let $p(Z_n = k \mid n)$ stand for the probability that $Z_n = k$ in exactly $n$ steps and let $p_{ai}(n) = p[Z_n = (a + i) \mid n]$ and $p_{bi}(n) = p[Z_n = -(b + i) \mid n]$. Then $p_{ai}(n)$ and $p_{bi}(n)$ are given by the coefficient of $\tau^n$ in the expansion of $\xi_{ai}E_{ai}\tau^n$ and $\xi_{bi}E_{bi}\tau^n$ respectively in a power series in $\tau$. That the expansions are valid can be seen from the following considerations: If we examine the solutions given for $\xi_{ai}E_{ai}\tau^n$ $(i = 0, 1, \ldots, m - 1)$ and $\xi_{bi}E_{bi}\tau^n$ $(i = 0, 1, \ldots, r - 1)$, we see that each is a ratio of two polynomials in $\tau$; the polynomial in the denominator is, in each case, the determinant of the linear equations (2.308). Now, it is easy to see that this determinant equals 1 when $\tau = 0$. Hence the expansions are valid in a neighborhood of $\tau = 0$.⁴

Let $p_{an} = p[Z_n \ge a \mid n]$ and $p_{bn} = p[Z_n \le -b \mid n]$; then

(2.309) $p_{an} = \sum_{i=0}^{m-1} p_{ai}(n)$

and

(2.310) $p_{bn} = \sum_{i=0}^{r-1} p_{bi}(n).$

We have also:

(2.311) $\sum_{n=m_1}^{\infty} p_{an} = \sum_{i=0}^{m-1} \xi_{ai} = p(Z_n \ge a)$

and

(2.312) $\sum_{n=m_2}^{\infty} p_{bn} = \sum_{i=0}^{r-1} \xi_{bi} = p(Z_n \le -b)$

where $m_1$ is the smallest integer greater than or equal to $a/m$, and $m_2$ is the smallest integer greater than or equal to $b/r$.

⁴ It can be seen from (2.303) that for a fixed $\tau$, $f(t, \tau) = 0$ implies that $\phi(t) = 1/\tau$. Hence if $\tau < 1$, $\phi(t) > 1$. Thus, the Fundamental Identity is valid in the neighborhood of $\tau = 0$.
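The power-series expansion can be carried out numerically without any symbolic algebra. The following is a minimal sketch under an assumption that is not in the paper: equations (2.308) are written as $(I - \tau T)d = e_b$ with $T_{jl} = p_{j-l}$, so that the coefficient of $\tau^n$ in $d$ is $T^n e_b$ (a Neumann series). The function name and the storage layout `p[k + r] = P(z = k)` are likewise hypothetical.

```python
from fractions import Fraction


def termination_distribution(p, r, m, a, b, n_max):
    """Return ([p_bn], [p_an]) for n = 1, ..., n_max, as in (2.309) and (2.310)."""
    size = a + b - 1

    def prob(k):
        return Fraction(p[k + r]) if -r <= k <= m else Fraction(0)

    # (2.308) reads (I - tau * T) d = e_b with T[j][l] = p_{j-l}, hence
    # d(tau) = sum_n tau^n T^n e_b; `power` holds T^(n-1) e_b.
    T = [[prob(j - l) for l in range(size)] for j in range(size)]
    power = [Fraction(1 if j == b - 1 else 0) for j in range(size)]

    def coef(vec, k):
        return vec[k] if 0 <= k < size else Fraction(0)

    p_b, p_a = [], []
    for n in range(1, n_max + 1):
        # Coefficient of tau^n in (2.306) and (2.307), summed over the overshoot j.
        p_b.append(sum(prob(i - r) * coef(power, r - j - i - 1)
                       for j in range(r) for i in range(r - j)))
        p_a.append(sum(prob(i + j + 1) * coef(power, a + b - i - 2)
                       for j in range(m) for i in range(m - j)))
        power = [sum(T[j][l] * power[l] for l in range(size))
                 for j in range(size)]
    return p_b, p_a


# Symmetric simple walk with boundaries -2 and 2.
half = Fraction(1, 2)
p_b, p_a = termination_distribution([half, 0, half], r=1, m=1, a=2, b=2, n_max=4)
print(p_b, p_a)
```

In this example the test cannot end on an odd step, and the probabilities of ending at the lower boundary on steps 2 and 4 are 1/4 and 1/8, as a direct enumeration of paths confirms.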


2.4. Application of the method to the binomial distribution. We shall 
consider the binomial in terms of acceptance inspection although the results 
are general. 

Let a sequential acceptance inspection plan be defined by $p_1$, $p_2$, $\alpha$ and $\beta$, where $p_1$ is the fraction defective which can be tolerated in the lot, $p_2$ is the fraction defective which cannot be tolerated, $\alpha$ is the maximum probability that the lot will be rejected when the fraction defective is $p_1$ or less and $\beta$ is the maximum probability that the lot will be accepted when the fraction defective is $p_2$ or greater. Then the sequential criterion is given by two parallel lines ([1] and [3]):

(2.401) $d_1 = -h_1 + sn$

(2.402) $d_2 = h_2 + sn$

where

(2.403) $h_1 = \dfrac{\log \dfrac{1 - \alpha}{\beta}}{\log \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}}$

(2.404) $h_2 = \dfrac{\log \dfrac{1 - \beta}{\alpha}}{\log \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}}$

(2.405) $s = \dfrac{\log \dfrac{1 - p_1}{1 - p_2}}{\log \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}}$

and $n$ is the number of observations taken sequentially. We assume that $\alpha + \beta < 1$ and $p_1 < p_2$. Then $h_1$ and $h_2$ are positive and $s$ lies between 0 and 1.

The sequential procedure is as follows: Items are examined one at a time in sequence. If at any stage the cumulative number of defectives found in the sample thus far taken is less than or equal to $d_1$ given by (2.401), the lot is accepted; if the cumulative number of defectives is greater than or equal to $d_2$ given by (2.402), the lot is rejected; if neither holds, then another observation is taken and the process continued.
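As an illustration, the plan constants and the stopping rule can be written out as follows. This is a sketch only: the function names are hypothetical, natural logarithms are used (any base gives the same ratios), and a count exactly on the upper line is treated as a rejection, which is an assumption chosen to agree with the integer boundaries used later in this section.

```python
import math


def binomial_plan(p1, p2, alpha, beta):
    """Return (h1, h2, s) of the sequential binomial plan."""
    denom = math.log(p2 * (1 - p1) / (p1 * (1 - p2)))
    h1 = math.log((1 - alpha) / beta) / denom   # (2.403)
    h2 = math.log((1 - beta) / alpha) / denom   # (2.404)
    s = math.log((1 - p1) / (1 - p2)) / denom   # (2.405)
    return h1, h2, s


def decide(defectives, n, h1, h2, s):
    """Apply the acceptance line (2.401) and the rejection line (2.402)."""
    if defectives <= -h1 + s * n:
        return "accept"
    if defectives >= h2 + s * n:   # ties counted as rejection (assumption)
        return "reject"
    return "continue"


h1, h2, s = binomial_plan(p1=0.1, p2=0.3, alpha=0.05, beta=0.1)
print(round(h1, 4), round(h2, 4), round(s, 4))
print(decide(0, 10, h1, h2, s), decide(1, 5, h1, h2, s), decide(3, 3, h1, h2, s))
```

With these illustrative inputs the slope $s$ falls between $p_1$ and $p_2$, as the text asserts it must.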

It is easy to show that the sequential test described above is equivalent to the following: A variate $z$ takes on the values $-s$ and $1 - s$ with respective probabilities $q$ and $p$. A sequential test is defined by the two boundaries $-h_1$ and $h_2$ and by the decision function $Z_n = \sum_{\alpha=1}^{n} z_\alpha$ where $z_\alpha$ is the $\alpha$th observation on $z$. The sequential test terminates if and only if $Z_n \le -h_1$ or $Z_n \ge h_2$.

As was mentioned above, $s$ lies between 0 and 1.⁵ We shall derive the exact power and the distribution of $n$ for this sequential test by assuming that $s = u/v$ where $u$ and $v$ are integers and $u < v$. This restriction is not serious since every value of $s$ can be approximated to any degree of accuracy by a rational fraction, and, moreover, when the sequential test is applied in practice, $s$ is always taken as rational.

Suppose $s = u/v$. Then the sequential test is equivalent to a test in which the variate $z$ takes on the values $-u$ and $v - u$ with probabilities $q$ and $p$, respectively, and the boundaries are given by $-h_1 v$ and $h_2 v$. Let $b$ be the smallest integer greater than or equal to $h_1 v$ and $a$ be the smallest integer greater than or equal to $h_2 v$. Then, since $u$ and $v$ are integers, there is no loss in generality in assuming that the boundaries are $-b$ and $a$. We shall also assume that $u$ and $v$ are prime to each other (i.e. the fraction $u/v$ is reduced to lowest terms) so that the interval $(-b, a)$ is the shortest possible for this test.

The above discussion shows that a sequential test based on the binomial can be considered as a special case of the class of tests treated in this section. Since $z$ takes only on two values, the linear equations (2.209) assume the simple form:

(2.401) $-p\,C_{j+u-v-1} + C_{j-1} - q\,C_{j+u-1} = \delta_{bj}$, $(j = 1, 2, \ldots, a + b - 1)$

where $C_k = 0$ when $k$ is negative or greater than $a + b - 2$. In terms of the $C$'s, the $\xi_{bj}$ and $\xi_{aj}$ are given by

(2.402) $\xi_{bj} = q\,C_{u-j-1}$, $(j = 0, 1, \ldots, u - 1)$,

(2.403) $\xi_{aj} = p\,C_{a+b+u-v+j-1}$, $(j = 0, 1, \ldots, v - u - 1)$.

The conditional generating functions of $n$ are obtained by solving (2.401) with $\tau p$ substituted for $p$ and $\tau q$ substituted for $q$.

Since the first $v - u$ and the last $u$ equations in (2.401) contain only two terms and all the other equations contain only three terms, the $C$'s can be obtained without too much difficulty by direct substitution provided $a + b$ is not very large. When $a + b$ is sizeable, a general solution is called for. So far, the author has been able to obtain this only for the case $u = 1$. This special case also has been considered by Walter Bartky [4].

Setting $u = 1$ in (2.401) we get

(2.407) $-p\,C_{j-v} + C_{j-1} - q\,C_j = \delta_{bj}$, $(j = 1, 2, \ldots, a + b - 1)$,

where $C_k = 0$ when $k$ is negative or greater than $a + b - 2$.


⁵ In fact, it follows from Theorem 1, section 3.2 below that $p_1 < s < p_2$.

Consider a general set of equations of the form (2.407) with the subscripts ranging from 1 to an arbitrary integer $k$. Let the determinant of these equations be designated by $\Delta_k$. Then by direct expansion it can be shown that $\Delta_k$ satisfies the difference equation

(2.408) $\Delta_k = \Delta_{k-1} - pq^{v-1}\Delta_{k-v}$

with the initial conditions

(2.409) $\Delta_j = 1$, $(j = 1, 2, \ldots, v - 1)$; $\Delta_v = 1 - pq^{v-1}$.

The difference equation (2.408) can be solved by well known methods. We set

(2.410) $\phi(x) = \sum_{j=1}^{\infty} \Delta_j x^{j-1}$

and then multiply each side of (2.410) by $1 - x + pq^{v-1}x^v$. This yields

(2.411) $(1 - x + pq^{v-1}x^v)\,\phi(x) = \sum_{j=1}^{\infty} [\Delta_j - \Delta_{j-1} + pq^{v-1}\Delta_{j-v}]\, x^{j-1}.$

But by (2.408) and (2.409) we find that the right-hand side of (2.411) equals $1 - pq^{v-1}x^{v-1}$. Therefore,

(2.412) $\phi(x) = \dfrac{1 - pq^{v-1}x^{v-1}}{1 - x + pq^{v-1}x^v}.$

If we expand (2.412) in a power series in $x$, the coefficient of $x^k$ will be $\Delta_{k+1}$. This expansion can be performed readily and we get:

(2.413) $\Delta_{k+1} = \sum_{i=0}^{m_1} (-1)^i\, C_i^{\,k-(v-1)i}\, (pq^{v-1})^i - \sum_{i=0}^{m_2} (-1)^i\, C_i^{\,k-(v-1)(i+1)}\, (pq^{v-1})^{i+1}$

where $m_1$ stands for the largest integer less than or equal to $k/v$, $m_2$ stands for the largest integer less than or equal to $(k - v + 1)/v$, and $C_i^{\,r} = r!/[i!\,(r - i)!]$.

Let us define $\Delta_0 = 1$ and $\Delta_k = 0$ when $k < 0$. Then, in terms of the extended definition of $\Delta_k$, $C_j$ is given by

(2.414) $C_j = \dfrac{\Delta_{a-1}\Delta_j - \Delta_{a+b-1}\Delta_{j-b}}{q^{\,j-b+1}\,\Delta_{a+b-1}}$

for $j = 0, 1, \ldots, a + b - 2$. To prove this, we substitute in the left-hand member of (2.407) the expression for $C_k$ given in (2.414) and get

(2.415) $\dfrac{\Delta_{a+b-1}(\Delta_{j-b} - \Delta_{j-b-1} + pq^{v-1}\Delta_{j-b-v}) - \Delta_{a-1}(\Delta_j - \Delta_{j-1} + pq^{v-1}\Delta_{j-v})}{q^{\,j-b}\,\Delta_{a+b-1}}.$

But in view of (2.408), (2.409) and the extended definition of $\Delta_k$, the expression in (2.415) vanishes for all $j \ne b$. When $j = b$, the expression equals 1. Hence, it follows that (2.414) is the desired solution.

Let $L_p = p[Z_n \le -b]$. Then $L_p$, when plotted against $p$, gives the operating characteristic curve for this sequential test. But $L_p = qC_0$. Hence, we have

(2.416) $L_p = q^b\, \dfrac{\Delta_{a-1}}{\Delta_{a+b-1}}.$
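The recurrence (2.408), the closed form (2.413) and the operating characteristic (2.416) are easy to check against each other. The sketch below is illustrative only; the function names are hypothetical, and exact rational arithmetic is used so that the two expressions for $\Delta_k$ can be compared without rounding.

```python
from fractions import Fraction
from math import comb


def delta_by_recurrence(k_max, p, v):
    """Delta_0, ..., Delta_{k_max} from (2.408) with the extended definition."""
    c = p * (1 - p) ** (v - 1)
    delta = {0: Fraction(1)}  # Delta_0 = 1; Delta_k = 0 for k < 0
    for k in range(1, k_max + 1):
        delta[k] = delta[k - 1] - c * delta.get(k - v, 0)
    return delta


def delta_closed_form(k_plus_1, p, v):
    """Delta_{k+1} from the expansion (2.413)."""
    k = k_plus_1 - 1
    c = p * (1 - p) ** (v - 1)
    first = sum((-1) ** i * comb(k - (v - 1) * i, i) * c ** i
                for i in range(k // v + 1))
    second = sum((-1) ** i * comb(k - (v - 1) * (i + 1), i) * c ** (i + 1)
                 for i in range((k - v + 1) // v + 1))
    return first - second


def oc_value(p, v, a, b):
    """L_p of (2.416) for the case u = 1."""
    delta = delta_by_recurrence(a + b - 1, p, v)
    return (1 - p) ** b * delta[a - 1] / delta[a + b - 1]


# v = 2 gives the simple walk z = -1 (probability q) or z = +1 (probability p).
print(oc_value(Fraction(2, 5), v=2, a=3, b=2))
```

For $v = 2$ the value agrees with the gambler's-ruin probability of reaching $-2$ before $+3$.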

As a final remark, we wish to point out that the solution to the sequential problem presented in this section, when taken in conjunction with Wald's solution, is of mathematical interest, since it relates each element of the inverse of a square matrix (designated by $A$ in this section) with the roots of a polynomial $f(t)$ given by (2.201).


III. Conjugate Distributions 

3.1. General discussion. Consider a random variable $X$ with a distribution density $f(x, \theta)$.⁶ Let $\theta_1$ and $\theta_2$ be two specified values of $\theta$ and let

(3.101) $z = \log \dfrac{f(x, \theta_2)}{f(x, \theta_1)}.$

For any hypothesis $\theta = \theta'$, let $\phi(t \mid \theta')$ be the moment generating function of $z$. That is,

(3.102) $\phi(t \mid \theta') = \int_{-\infty}^{\infty} e^{zt} f(x, \theta')\, dx.$

Let $h$ be the real non-zero value of $t$ for which $\phi(t \mid \theta') = 1$⁷ and let

(3.103) $F(x) = e^{hz} f(x, \theta').$

Then $F(x)$ is a distribution density. Following Wald [5], we shall call $F(x)$ and $f(x, \theta')$ conjugate distributions.

The distribution density $F(x)$ depends on $\theta_1$, $\theta_2$, and $\theta'$. In some instances $F(x)$ will be a member of the class of distributions $f(x, \theta)$. This is the case, for example, when $z$ is a discrete variate. It is the case also if $\theta' = \theta_1$. For then $h = 1$ and $F(x) = f(x, \theta_2)$. If $F(x)$ belongs to the class of distributions $f(x, \theta)$, we shall designate $F(x)$ by $f(x, \theta'')$ and call $\theta'$ and $\theta''$ a conjugate pair.


3.2. Conjugate pairs and the power curve for sequential probability ratio tests in which the underlying distributions admit a sufficient statistic. Let $f(x, \theta)$ admit a sufficient statistic and let a sequential test be defined in terms of the probability ratio $z$ given by (3.101) for some specified hypothesis $\theta_1$ and alternative hypothesis $\theta_2$ with $\theta_1 < \theta_2$. Let the boundaries be given by $-b$ and $a$ where $a$ and $b$ are positive. Since $f(x, \theta)$ admits a sufficient statistic, it can be written in the form

(3.201) $f(x, \theta) = e^{u(x)v(\theta) + r(x) + w(\theta)}.$

The probability ratio $z$ is then given by the simple expression

(3.202) $z = u(x)[v(\theta_2) - v(\theta_1)] + w(\theta_2) - w(\theta_1).$

⁶ If $X$ is discrete, then $f(x, \theta)$ stands for the probability that $X = x$ when $\theta$ is true.
⁷ See section 2.31 and Lemma II, section 2.32 in [6].

Let

(3.203) $b^* = \dfrac{b}{v(\theta_2) - v(\theta_1)}$

(3.204) $a^* = \dfrac{a}{v(\theta_2) - v(\theta_1)}$

(3.205) $s = \dfrac{w(\theta_1) - w(\theta_2)}{v(\theta_2) - v(\theta_1)}.$

In terms of $b^*$, $a^*$ and $s$, the sequential criterion is defined by two parallel lines⁸

(3.206) $A_n = -b^* + sn$

(3.207) $R_n = a^* + sn$

and the decision function $\sum_{\alpha=1}^{n} u(x_\alpha)$. The hypothesis $\theta = \theta_1$ is accepted whenever $\sum_{\alpha=1}^{n} u(x_\alpha) \le A_n$ and rejected whenever $\sum_{\alpha=1}^{n} u(x_\alpha) \ge R_n$. If $A_n < \sum_{\alpha=1}^{n} u(x_\alpha) < R_n$, another observation is taken. This process is continued until one or the other decision is reached.

In what follows, we shall restrict ourselves to the general class of functions $f(x, \theta)$ for which the differentiations under the integral sign indicated below are permissible and $v(\theta)$ is a monotonic function of $\theta$.

Consider the function

(3.208) $\psi(\theta) = s\,v(\theta) + w(\theta).$

We shall show that $\psi(\theta) = \text{constant}$ has exactly two roots in $\theta$. To this end, we prove the following theorems.

Theorem 1. Let $Eu(x) \mid \theta$ be the expected value of $u(x)$ under the assumption that $\theta$ is true. Then there exists a value of $\theta = \theta_0$ such that (a) $Eu(x) \mid \theta_0 = s$; (b) $\theta_1 < \theta_0 < \theta_2$ and $Eu(x) \mid \theta_1 < s < Eu(x) \mid \theta_2$ if $v(\theta)$ is an increasing function of $\theta$, and the inequalities are reversed if $v(\theta)$ is a decreasing function of $\theta$.

Proof: Assume that $v(\theta)$ is an increasing function of $\theta$. Let $z^* = u(x) - s$ and let $\phi(t \mid \theta)$ be the moment generating function of $z^*$ under the hypothesis that $\theta$ is true. Then, it is easy to see that $\phi(h \mid \theta_1) = 1$ and $\phi(-h \mid \theta_2) = 1$ where $h = v(\theta_2) - v(\theta_1)$. Since $h$ is positive, it follows by Lemma 1, section 2.6 of [6], that $Ez^* \mid \theta_1 < 0$ and $Ez^* \mid \theta_2 > 0$. Therefore, $Eu(x) \mid \theta_1 < s$ and $Eu(x) \mid \theta_2 > s$. Moreover, as we shall see in the proof of Theorem 2 below, $Eu(x) \mid \theta$ is assumed to be a continuous function of $\theta$ and proved to be monotonically increasing. Hence it must follow that there exists a $\theta = \theta_0$ such that $Eu(x) \mid \theta_0 = s$ and $\theta_1 < \theta_0 < \theta_2$. This proves the theorem in case $v(\theta)$ is monotonically increasing. However, the argument is identically the same in case $v(\theta)$ is monotonically decreasing.

⁸ It is here assumed that $v(\theta_2) - v(\theta_1) > 0$. If this is not the case, then $a^*$ and $b^*$ have to be interchanged.

Theorem 2. Let $\psi(\theta)$ be defined as in (3.208). Then $\psi(\theta)$ is a monotonically increasing function of $\theta$ in the interval $\theta < \theta_0$; assumes a maximum at $\theta = \theta_0$; and is a monotonically decreasing function of $\theta$ in the interval $\theta > \theta_0$.

Proof. If we differentiate twice the identity

(3.209) $\int_{-\infty}^{\infty} e^{u(x)v(\theta) + r(x) + w(\theta)}\, dx = 1$

with respect to $\theta$ we get

(3.210) $v'(\theta)\,Eu(x) \mid \theta + w'(\theta) = 0$

and

(3.211) $v''(\theta)\,Eu(x) \mid \theta + w''(\theta) = -[v'(\theta)]^2 \sigma^2_{u(x)}$

where $\sigma^2_{u(x)}$ is the variance of $u(x)$. Also, if we differentiate under the integral sign the function $Eu(x) \mid \theta$ with respect to $\theta$, we get

(3.212) $\dfrac{d\,Eu(x) \mid \theta}{d\theta} = v'(\theta)\,\sigma^2_{u(x)}.$

Now by hypothesis, $v(\theta)$ is monotonic in $\theta$. Hence from (3.212) we see that $Eu(x) \mid \theta$ is also monotonic. Moreover, if $v(\theta)$ is an increasing function of $\theta$, so is $Eu(x) \mid \theta$, and conversely. Let us assume that $v(\theta)$ increases with $\theta$. Then for all $\theta < \theta_0$, $Eu(x) \mid \theta < s$ and for all $\theta > \theta_0$, $Eu(x) \mid \theta > s$. Consequently, we have

(3.213) $\psi'(\theta) > v'(\theta)\,Eu(x) \mid \theta + w'(\theta)$

for all $\theta < \theta_0$ and

(3.214) $\psi'(\theta) < v'(\theta)\,Eu(x) \mid \theta + w'(\theta)$

for all $\theta > \theta_0$. But by (3.210) the right-hand side of these inequalities is equal to zero for all $\theta$. Hence $\psi'(\theta) > 0$ for $\theta < \theta_0$ and $\psi'(\theta) < 0$ for $\theta > \theta_0$. The same argument holds when $v(\theta)$ is a decreasing function of $\theta$. Now let $\theta = \theta_0$. Then by (3.210), we see that $\psi'(\theta_0) = 0$. Hence, $\psi(\theta)$ is a maximum at $\theta = \theta_0$. This proves the theorem.

Let $c$ be any constant $< \psi(\theta_0)$ within the domain of $\psi(\theta)$. Then by Theorem 2, the equation $\psi(\theta) = c$ has two roots in $\theta$. Let these roots be designated by $\theta'$ and $\theta''$. We now prove the following theorem.

Theorem 3. Let $z^*$ and $\phi(t \mid \theta)$ be defined as above. Then (a) $\phi(t \mid \theta') = 1$ for $t = v(\theta'') - v(\theta')$; (b) $\phi(t \mid \theta'') = 1$ for $t = v(\theta') - v(\theta'')$; and (c) $\theta'$ and $\theta''$ form a conjugate pair with respect to $z^*$.

Proof: By definition

(3.215) $\phi(t \mid \theta') = \int_{-\infty}^{\infty} e^{[u(x) - s]t}\, e^{u(x)v(\theta') + r(x) + w(\theta')}\, dx.$

Now let $t = v(\theta'') - v(\theta') = h$. Then, in view of the fact that $\psi(\theta') = \psi(\theta'')$, that is, $s\,v(\theta') + w(\theta') = s\,v(\theta'') + w(\theta'')$, we get

(3.216) $\phi(h \mid \theta') = \int_{-\infty}^{\infty} e^{u(x)v(\theta'') + r(x) + w(\theta'')}\, dx = 1.$

In a similar manner, it can be shown that $\phi(-h \mid \theta'') = 1$. Moreover, the same argument also shows that $f(x, \theta'') = e^{hz^*} f(x, \theta')$. This proves the theorem.
Turning now to the sequential test defined by (3.206) and (3.207), we see that it is equivalent to a test with the decision function $Z_n^* = \sum_{\alpha=1}^{n} z_\alpha^*$ and the two boundaries $-b^*$ and $a^*$. Let $L_\theta$ be the probability that the sequential test will terminate with $Z_n^* \le -b^*$ (i.e. the hypothesis $\theta_1$ is accepted) when $\theta$ is true. Then (neglecting the fact that at a decision point $Z_n^*$ might exceed $a^*$ or fall short of $-b^*$), $L_{\theta'}$ and $L_{\theta''}$ are given by (see for example (2.406) in [6])

(3.217) $L_{\theta'} = \dfrac{e^{(a^* + b^*)h} - e^{b^* h}}{e^{(a^* + b^*)h} - 1}$

and

(3.218) $L_{\theta''} = \dfrac{e^{-h(a^* + b^*)} - e^{-b^* h}}{e^{-h(a^* + b^*)} - 1} = e^{-b^* h} L_{\theta'}$

where $h = v(\theta'') - v(\theta')$. Thus, we see that the two roots of the equation $\psi(\theta) = c$ determine two points on the power curve for the sequential test. By assigning various values to $c$ we obtain as many pairs of points as desired.

The above results show that for the class of distributions under consideration, the real non-zero roots of $\phi(t \mid \theta) = 1$ are obtainable from the roots of $\psi(\theta) = \text{constant}$. Since $\psi(\theta)$ is completely defined by the form of the distribution $f(x, \theta)$, the power curve of the sequential test can be obtained without a knowledge of the moment generating function of $z^*$. This might be advantageous in some cases.
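A worked case may make the procedure concrete. The sketch below takes the binomial (Bernoulli) family as an illustrative choice, for which $u(x) = x$, $v(\theta) = \log[\theta/(1 - \theta)]$, $w(\theta) = \log(1 - \theta)$ and $\theta_0 = s$. The function names, the bisection search for the second root of $\psi(\theta) = c$, and the use of $h_1$ and $h_2$ from (2.403) and (2.404) as $b^*$ and $a^*$ are assumptions made for the example, not part of the text.

```python
import math


def logit(theta):
    # v(theta) for the family f(x, theta) = theta^x (1 - theta)^(1 - x)
    return math.log(theta / (1.0 - theta))


def psi(theta, s):
    # (3.208) with v(theta) = logit(theta) and w(theta) = log(1 - theta)
    return s * logit(theta) + math.log(1.0 - theta)


def conjugate(theta, s, iterations=200):
    """Other root of psi(.) = psi(theta), by bisection across the maximum at s."""
    target = psi(theta, s)
    inner = s                                     # psi(inner) >= target
    outer = 1.0 - 1e-12 if theta < s else 1e-12   # psi(outer) < target
    for _ in range(iterations):
        mid = 0.5 * (inner + outer)
        if psi(mid, s) > target:
            inner = mid
        else:
            outer = mid
    return 0.5 * (inner + outer)


def oc_point(theta, theta1, theta2, alpha, beta):
    """L_theta from (3.217), neglecting the excess over the boundaries.

    Not defined at theta = s, where the two roots coincide and h = 0.
    """
    scale = logit(theta2) - logit(theta1)
    s = math.log((1.0 - theta1) / (1.0 - theta2)) / scale   # (3.205)
    b_star = math.log((1.0 - alpha) / beta) / scale          # h_1 of (2.403)
    a_star = math.log((1.0 - beta) / alpha) / scale          # h_2 of (2.404)
    h = logit(conjugate(theta, s)) - logit(theta)
    top = math.exp((a_star + b_star) * h) - math.exp(b_star * h)
    return top / (math.exp((a_star + b_star) * h) - 1.0)


for theta in (0.05, 0.1, 0.15, 0.25, 0.3, 0.4):
    print(theta, round(oc_point(theta, 0.1, 0.3, 0.05, 0.1), 4))
```

In this example $\theta_1$ and $\theta_2$ are themselves a conjugate pair, and the computed curve passes through $1 - \alpha$ at $\theta_1$ and $\beta$ at $\theta_2$, as expected under the approximation.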


3.3. The distribution of n under conjugate hypotheses. Let $P_b(n \mid g)$ stand for the probability that a sequential test will terminate with $Z_n \le -b$ in exactly $n$ steps when the distribution density of $x$ is $g$. Let $P_a(n \mid g)$ be similarly defined.

Theorem 1. If we neglect the excess of $Z_n$ over $a$ and $-b$ at a decision point,

(3.301) $P_b(n \mid F) = e^{-hb} P_b(n \mid f)$

(3.302) $P_a(n \mid F) = e^{ha} P_a(n \mid f)$

where $f$ and $F$ are conjugate distributions as defined in (3.103) and $h$ is the non-zero real value of $t$ for which $\phi(t \mid f)$, the characteristic function of $z = \log [f(x, \theta_2)/f(x, \theta_1)]$ under the hypothesis $f$, equals 1.

Proof: Since, by definition, $F = e^{hz} f$, it follows that $\psi(t - h \mid F) = \phi(t \mid f)$ where $\psi(t \mid F)$ is the characteristic function of $z$ under the hypothesis $F$. Let

(3.303) $\phi(t \mid f) = e^{-\tau}$

where $\tau$ is a pure imaginary. Furthermore, let $t_1(\tau)$ and $t_2(\tau)$ be the roots of (3.303) such that $\lim_{\tau \to 0} t_1(\tau) = 0$ and $\lim_{\tau \to 0} t_2(\tau) = h$ (see [2], page 289). Then $t_1(\tau) - h$ and $t_2(\tau) - h$ will be the corresponding roots of

(3.304) $\psi(t \mid F) = e^{-\tau}.$

Now by the Fundamental Identity we have

(3.305) $L_f\, e^{-b t_1(\tau)} E_{bf}\, e^{\tau n} + (1 - L_f)\, e^{a t_1(\tau)} E_{af}\, e^{\tau n} = 1$

(3.306) $L_f\, e^{-b t_2(\tau)} E_{bf}\, e^{\tau n} + (1 - L_f)\, e^{a t_2(\tau)} E_{af}\, e^{\tau n} = 1$

and

(3.307) $L_F\, e^{-b[t_1(\tau) - h]} E_{bF}\, e^{\tau n} + (1 - L_F)\, e^{a[t_1(\tau) - h]} E_{aF}\, e^{\tau n} = 1$

(3.308) $L_F\, e^{-b[t_2(\tau) - h]} E_{bF}\, e^{\tau n} + (1 - L_F)\, e^{a[t_2(\tau) - h]} E_{aF}\, e^{\tau n} = 1$

where $L_f = P[Z_n \le -b \mid f]$; $E_{bf}\, e^{\tau n}$ stands for the expected value of $e^{\tau n}$ under the hypothesis $f$ and the restriction $Z_n \le -b$; $E_{af}\, e^{\tau n}$ stands for the expected value of $e^{\tau n}$ under the hypothesis $f$ and the restriction $Z_n \ge a$; and the symbols $L_F$, $E_{bF}$ and $E_{aF}$ are similarly defined.

By comparing equations (3.305) and (3.306) with (3.307) and (3.308) we see that

(3.309) $L_F E_{bF}\, e^{\tau n} = e^{-hb} L_f E_{bf}\, e^{\tau n}$

and

(3.310) $(1 - L_F) E_{aF}\, e^{\tau n} = e^{ha} (1 - L_f) E_{af}\, e^{\tau n}.$

Since the above relationships hold for the characteristic functions of $n$, they must also hold for the distribution of $n$. This proves the theorem.

If we set $\tau = 0$ in (3.309) and (3.310) we also get

(3.311) $L_F = e^{-hb} L_f$

and

(3.312) $1 - L_F = e^{ha}(1 - L_f).$

In view of (3.311) and (3.312) we see from (3.309) and (3.310) that

(3.313) $E_{bF}\, e^{\tau n} = E_{bf}\, e^{\tau n}$

and

(3.314) $E_{aF}\, e^{\tau n} = E_{af}\, e^{\tau n}.$

From (3.313) and (3.314) we obtain the following rather surprising theorem.

Theorem 4. Except for the approximation indicated in Theorem 1, the conditional distribution of $n$ under the restriction that $Z_n \le -b$ as well as the restriction that $Z_n \ge a$ is identical for the two hypotheses $F$ and $f$.

The above theorems are of particular interest when $F$ is a member of the class of distributions $f$. In any given sequential test the results of Theorem 1 can be used to facilitate the computation of the probabilities of making a decision. Furthermore, the results of Theorem 4 show that the conditional distribution of $n$ throws no light on the parameter $\theta$ involved in the distribution of $z$. This follows since the conditional distribution of $n$ is identical for the conjugate pair $\theta'$ and $\theta''$, and, in any practical problem, $\theta'$ and $\theta''$ will represent opposing hypotheses.

We shall now establish exact relationships of the type considered above when 
the variate z takes on a finite number of integral values. 

Let $z$ take on the values $-r, -r + 1, \ldots, -1, 0, 1, 2, \ldots, m$ with $P(z = i) = p_i$. Furthermore, let $P_i = e^{hi} p_i$ where $h$ is the real non-zero root of

(3.315) $\sum_{i=-r}^{m} p_i e^{ti} = 1.$

Then the probabilities $P_i$ and $p_i$ are conjugate. We set $e^t = u$ and define $\phi(u \mid \theta)$ to be the generating function of $z$ under the hypothesis $p(z = i) = \theta_i$. Then

(3.316) $\phi(u \mid p) = \sum_{i=-r}^{m} p_i u^i$

and

(3.317) $\phi(u \mid P) = \sum_{i=-r}^{m} P_i u^i = \sum_{i=-r}^{m} p_i (e^h u)^i.$

Consider a sequential test defined by two boundaries $-b$ and $a$ and a decision function $Z_n = \sum_{\alpha=1}^{n} z_\alpha$. Let $\xi_{bi}^{\theta}$ and $\xi_{ai}^{\theta}$ stand for the probabilities that $Z_n = -(b + i)$ and $Z_n = a + i$ respectively under the hypothesis that $\theta_i = P(z = i)$. Furthermore, let $P_{bi}(n \mid \theta)$ and $P_{ai}(n \mid \theta)$ stand for the probabilities that $Z_n = -(b + i)$ and $Z_n = (a + i)$ respectively in exactly $n$ steps, under the hypothesis $\theta_i = P(z = i)$. Also, let the symbols $E_{bi}^{\theta}$ and $E_{ai}^{\theta}$ stand for conditional expectations under the hypothesis $\theta_i = P(z = i)$ and under the restriction that $Z_n = -(b + i)$ and $Z_n = a + i$ respectively.

Since $z$ takes on a finite number of integral values, the Fundamental Identity for the two conjugate hypotheses, $p$ and $P$, can be written as:

(3.318) $\sum_{i=0}^{r-1} \xi_{bi}^{p}\, u^{-(b+i)} E_{bi}^{p}[\phi(u \mid p)]^{-n} + \sum_{i=0}^{m-1} \xi_{ai}^{p}\, u^{a+i} E_{ai}^{p}[\phi(u \mid p)]^{-n} = 1$

and

(3.319) $\sum_{i=0}^{r-1} \xi_{bi}^{P}\, u^{-(b+i)} E_{bi}^{P}[\phi(u \mid P)]^{-n} + \sum_{i=0}^{m-1} \xi_{ai}^{P}\, u^{a+i} E_{ai}^{P}[\phi(u \mid P)]^{-n} = 1.$

For any real number $\tau$ let $u_1(\tau), u_2(\tau), \ldots, u_{r+m}(\tau)$ be the $r + m$ roots of the equation:

(3.320) $\phi(u \mid p) = \dfrac{1}{\tau}.$

Then, in view of (3.317) the corresponding roots of

(3.321) $\phi(u \mid P) = \sum_{i=-r}^{m} P_i u^i = \dfrac{1}{\tau}$

are given by $u_1(\tau)e^{-h}, u_2(\tau)e^{-h}, \ldots, u_{r+m}(\tau)e^{-h}$. Substituting these roots in (3.318) and (3.319) successively, we get

(3.322) $\sum_{i=0}^{r-1} \xi_{bi}^{p}\, [u_j(\tau)]^{-(b+i)} E_{bi}^{p}\tau^n + \sum_{i=0}^{m-1} \xi_{ai}^{p}\, [u_j(\tau)]^{a+i} E_{ai}^{p}\tau^n = 1$

and

(3.323) $\sum_{i=0}^{r-1} \xi_{bi}^{P}\, [u_j(\tau)e^{-h}]^{-(b+i)} E_{bi}^{P}\tau^n + \sum_{i=0}^{m-1} \xi_{ai}^{P}\, [u_j(\tau)e^{-h}]^{a+i} E_{ai}^{P}\tau^n = 1$

for $j = 1, 2, \ldots, r + m$. Since the roots $u_j(\tau)$ are assumed to be known, the unknowns in (3.322) and (3.323) can be solved in terms of these roots provided the determinant of the equations is different from zero. But in section 2, we have indirectly shown that for a sufficiently small $\tau$, the determinant is different from zero. Thus, assuming that the solution has been obtained we see from (3.322) and (3.323) that

(3.324) $\xi_{bi}^{P} E_{bi}^{P}\tau^n = e^{-h(b+i)}\, \xi_{bi}^{p} E_{bi}^{p}\tau^n$

and

(3.325) $\xi_{ai}^{P} E_{ai}^{P}\tau^n = e^{h(a+i)}\, \xi_{ai}^{p} E_{ai}^{p}\tau^n.$

Setting $\tau = 1$, we get

(3.326) $\xi_{bi}^{P} = e^{-h(b+i)}\, \xi_{bi}^{p}$

and

(3.327) $\xi_{ai}^{P} = e^{h(a+i)}\, \xi_{ai}^{p}.$

Moreover, if we expand the expressions in (3.324) and (3.325) in a power series in $\tau$ (which by section 2 is permissible), and compare coefficients of $\tau^n$ we get

(3.328) $P_{bi}(n \mid P) = e^{-h(b+i)} P_{bi}(n \mid p)$

and

(3.329) $P_{ai}(n \mid P) = e^{h(a+i)} P_{ai}(n \mid p).$

[1] Abraham Wald, "Sequential tests of statistical hypotheses," Annals of Math. Stat.,
Vol. 16 (1945), pp. 117-186.

[2] Abraham Wald, "On cumulative sums of random variables," Annals of Math. Stat.,
Vol. 15 (1944), pp. 283-296.

[3] Statistical Research Group, Columbia University, Sequential Analysis of Statistical
Data: Applications, Columbia Univ. Press, 1945.

[4] Walter Bartky, "Multiple sampling with constant probability," Annals of Math. Stat.,
Vol. 14 (1943), pp. 363-377.

[5] Abraham Wald, "Some generalizations of the theory of cumulative sums of random
variables," Annals of Math. Stat., Vol. 16 (1945), pp. 287-293.

[6] M. A. Girshick, "Contributions to the theory of sequential analysis, I," Annals of Math.
Stat., Vol. 17 (1946), pp. 123-143.



SUFFICIENT STATISTICAL ESTIMATION FUNCTIONS FOR THE 
PARAMETERS OF THE DISTRIBUTION OF MAXIMUM VALUES 

By Bradford F. Kimball 
New York State Department of Public Service 

1. Summary. The problem of estimating from a sample a confidence region 
for the parameters of the distribution of maximum values is treated by setting 
up what are called "statistical estimation functions” suggested by the func- 
tional form of the probability distribution of the sample, and finding the moment 
generating function of the probability distribution of these estimation functions.
Such an estimate by the method of maximum likelihood is also treated. 

A definition of “sufficiency” is proposed for “statistical estimation functions” 
analogous to that which applies to “statistics.” Also the concept of “stable 
statistical estimation functions” is introduced. 

By means of a numerical illustration, four methods are discussed for setting 
up an approximate confidence interval for the estimated value of x of the uni- 
verse of maximum values which corresponds to a given cumulative frequency 
.99, for confidence level .95. Two procedures for solving this problem are
recommended as practicable.

2. Introduction. If the universe comprises a set of maximum values of a
large number of quantities, it has been shown that in many cases the probability
density function of such a set of values of $x$ is given approximately by

(2.1) $f(x) = \alpha e^{-t - e^{-t}}, \quad t = \alpha(x - u), \quad -\infty < x < +\infty,$

where $\alpha$ and $u$ denote parameters, usually unknown [1].

This paper is concerned with the problem of estimation of the parameters
$\alpha$ and $u$ on the basis of sample data.
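A small numerical sketch of (2.1) may help fix the notation. It assumes the standard fact that the cumulative frequency belonging to this density is $\exp(-e^{-t})$; the parameter values and the function names `density` and `cumulative` are illustrative choices, not taken from the paper.

```python
import math

def density(x, alpha, u):
    """Density (2.1): alpha * exp(-t - exp(-t)) with t = alpha * (x - u)."""
    t = alpha * (x - u)
    return alpha * math.exp(-t - math.exp(-t))

def cumulative(x, alpha, u):
    """Cumulative frequency exp(-exp(-t)), the integral of the density."""
    t = alpha * (x - u)
    return math.exp(-math.exp(-t))

alpha, u = 0.02, 180.0   # arbitrary illustrative parameter values

# Trapezoidal check that the density integrates to the cumulative frequency.
a, b, steps = 100.0, 400.0, 30000
hstep = (b - a) / steps
area = sum(density(a + k * hstep, alpha, u) for k in range(1, steps)) * hstep
area += 0.5 * hstep * (density(a, alpha, u) + density(b, alpha, u))
print(area, cumulative(b, alpha, u) - cumulative(a, alpha, u))
```

The density has its mode at $x = u$, where it equals $\alpha/e$, so $u$ locates the distribution and $1/\alpha$ sets its scale.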

The notion of "sufficiency” is fundamental in the problem of estimation, 
since it means that the necessary elements of the sample have been used which 
will result in complete determination of that part of the sample probability 
distribution function involving the unknown parameters to be estimated. 
Unfortunately it does not seem to be possible to set up “sufficient statistics” 
within the usual definition of "statistic” for the above distribution. In this 
investigation the writer was struck by the fact that certain functions of the data 
involving one of the parameters could be used to play a very similar role to a 
set of sufficient statistics for determining $\alpha$ and $u$, in spite of the fact that one
function involved the value of $\alpha$, and hence was not directly determined by the
data, and hence was not a “statistic.”

Various statistics have been used in the past to estimate the parameters $\alpha$
and $u$, such as the sample mean, variance, mean deviation and an adjusted
modal value (see [2] and [3]). For the reason noted above, sufficient statistics
have not been developed. In order to bridge this impasse and meet the
essentials of the condition of sufficiency, the writer believes that a broader
definition of sufficiency is needed. Such a definition is developed in the following
section.

3. A broader definition of sufficiency. If the reader reexamines the process
of estimating the two parameters of the normal distribution, and the
determination of the two parameter confidence region for them from the statistics
consisting of the sample mean, and the mean square deviation of the sample
values from their mean, he will find that the separate determination of $\bar{x}$ and
$s^2$ is not inherently necessary. The mean $a$ and the variance $\sigma^2$ of the universe
are usually estimated from the pair of equations

$E(\bar{x}) = a, \quad E(s^2) = (n - 1)\sigma^2/n$

and the boundary of the confidence region is determined from knowledge of
the bivariate distribution of $\bar{x}$ and $s$, which involves the four variables $\bar{x}$, $s$,
$a$, and $\sigma$. The equation of the bounding curve is most easily set up in terms
of transformed variables such as

(3.1) $U = \sqrt{n}\,(\bar{x} - a)/\sigma, \quad V = \sqrt{n}\, s/\sigma.$

Then the probability density of $U$ and $V$ is given by

$f(U, V) = (\text{const.})\, V^{n-2} e^{-(U^2 + V^2)/2}$

and with confidence coefficient $\beta$, a bounding curve may be defined implicitly
by the two equations

$\iint f(U, V)\, dU\, dV = \beta, \quad f(U_1, V_1) = \text{constant},$

where the above integral is taken over the region of the $V \geq 0$ half of the $U, V$
plane bounded by the curve $f(U_1, V_1) = \text{constant}$.
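The reason the bounding curve is free of the parameters can be seen in a short sketch. It assumes only that a normal sample may be written $x_i = a + \sigma e_i$ with standard normal $e_i$; then $U$ and $V$ of (3.1) depend on the $e_i$ alone. The function name `pivots` and the parameter values are hypothetical.

```python
import math
import random

def pivots(sample, a, sigma):
    """U and V of (3.1); s is the root mean square deviation from the sample mean."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / n)
    return math.sqrt(n) * (xbar - a) / sigma, math.sqrt(n) * s / sigma

rng = random.Random(0)
e = [rng.gauss(0.0, 1.0) for _ in range(12)]     # standard normal draws

results = []
for a, sigma in [(0.0, 1.0), (50.0, 7.5)]:
    sample = [a + sigma * ei for ei in e]        # same draws, different universe
    results.append(pivots(sample, a, sigma))
print(results)
```

Both parameter settings give the same pair $(U, V)$ up to rounding, which is why a region fixed in the $U, V$ plane translates into a region for $(a, \sigma)$ once the sample is observed.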

A range of estimate of the parameters $a$ and $\sigma$ is offered by this confidence
region by virtue of the fact that each point of the region corresponds to a unique
pair of values of $a$ and $\sigma$ for a given set of sample values $O_n(x_i)$, and the fact
that the equation of the bounding curve does not involve the parameters $a$ and $\sigma$.
Thus one arrives at a determinate range of estimate of $a$ and $\sigma$, after the sample
values have been observed. In this paper such functions will be referred to
as statistical estimation functions (see [4]).

The classical idea of sufficiency implies (a) that the estimate be adequate 
for unique determination of the parameters, and (b) that all the sample in- 
formation pertinent to such estimation be used. In the case of “statistics” 
the second requirement has been simply and elegantly formulated by the 
requirement that the probability density function of the sample distribution 
factor in such a way that one factor be completely determined by the statistical
estimates and the parameters of the distribution, and that the remaining factor 
be independent of the parameters to be estimated (see [7], or [5] p. 135). 

It seems to be possible to carry over this formulation to statistical estimation
functions (denoted by $T_i$). Since one or more of the parameters to be estimated,
denoted by $(a_1, a_2, \ldots, a_r)$, are involved in these functions, a requirement that
they be adequate for unique determination of these parameters is obviously
that there be a one-to-one correspondence between the parameter set
$(a_1, a_2, \ldots, a_r)$ and the set of estimation functions $(T_1, T_2, \ldots, T_r)$ in the region of
estimate. This requirement will be referred to as Requirement (1).

It has been pointed out by a referee that some further requirement as to the
independence of the probability density function of $(T_1, T_2, \ldots, T_r)$ relative
to the parameters to be estimated is needed.

If one requires that the p.d.f. of $(T_1, T_2, \ldots, T_r)$ be entirely independent
of the parameters $(a_1, a_2, \ldots, a_r)$ the estimation functions will furnish
“confidence regions” for estimates of the parameters; see the example noted above
for the normal distribution.

However, in some cases the mean values $E(T_i)$ may be independent of the
parameters, while the p.d.f. may not be; for example, estimation functions
for the two parameters of the Pearson Type III distribution formed from the
maximum likelihood functions of that distribution. In such cases, a point
estimation of the parameters is still possible, and would seem to satisfy the
classical requirements of sufficiency.

The author accordingly makes the following proposals: 

(a) Statistical estimation functions that satisfy the first two requirements — 
that of one-to-one correspondence with the parameters to be estimated, and the 
factoribility condition — be termed sufficient for estimation of the parameters. 
The reasonableness of such a definition is strengthened by the observation 
that given a set of “sufficient statistics” in the classical sense, statistical estima- 
tion functions that satisfy the factoribility condition can always be formed from 
them, and hence they are subject further only to Requirement (1) to make 
them sufficient statistical estimation functions under the proposed definition. 

(b) Statistical estimation functions that satisfy Requirement (1) and also 
have a p. d. f. which is independent of the parameters to be estimated shall be 
called stable — a term suggested to the author by a referee. 

(c) Statistical estimation functions $T_i$ that satisfy Requirement (1) and are
such that $E(T_i)$, $(i = 1, 2, \ldots, r)$, be independent of the parameters to be
estimated, be called stable in mean, and that similarly, if the modal or median
values of $T_i$ be independent of these parameters, they be called stable in mode,
stable in median, etc.

Thus a definition of sufficiency applicable to statistical estimation functions 
is formulated as follows: 

The term “statistical estimation function” will be used to denote a function
of the sample values and one or more population parameters, used for purposes
of statistical estimation.

Given a universe with probability density function involving $m$ parameters
$a_1, a_2, \ldots, a_m$ in an admissible region $R$, and a set of $r$ statistical estimation
functions $T_i(O_n; a_1, a_2, \ldots, a_m)$ to be used for estimating the $r$ parameters
$a_1, a_2, \ldots, a_r$ relative to the information in a given sample $O_n$. Consider
the conditions:

(1) The functional form $T_i$ insures a one-to-one correspondence between
the points of the $r$-parameter space $(a_1, a_2, \ldots, a_r)$ contained in $R$ and the points
of the $r$-space defined by $(T_1, T_2, \ldots, T_r)$ for fixed $O_n(x_i)$ and fixed parameter
values $a_{r+1}, a_{r+2}, \ldots, a_m$.

(2a) It shall be possible to express the probability density function of the
sample $O_n$ as

$P(O_n) = g_1(T_1, T_2, \ldots, T_r; a_1, a_2, \ldots, a_m) \cdot g_2(O_n; a_{r+1}, a_{r+2}, \ldots, a_m),$

where the first factor is uniquely determinable for fixed $(a_1, a_2, \ldots, a_m)$ from
the corresponding values of the functions $T_i$, and the second factor is
independent of the parameters to be estimated.

(2b) It shall be possible to express the probability density function of the
sample $O_n$ as

$P(O_n) = G\{T_1, T_2, \ldots, T_r; a_1, a_2, \ldots, a_m\} \cdot g_2(O_n; a_{r+1}, a_{r+2}, \ldots, a_m),$

where $G\{\cdot, \cdot, \ldots, \cdot\,; a_1, a_2, \ldots, a_m\}$ is a functional, depending on $a_1, a_2, \ldots, a_m$,
which in general involves the values of the $T_i$ for values of $a_1, a_2, \ldots, a_m$
different from those appearing in the rest of the identity. (For example,
$G(T, a) = \exp \int T(O_n; a')\, da'$.)

(3) The $r$-variate probability density function of $T_i$ based on
$P(O_n; a_1, a_2, \ldots, a_m)$ shall exist.

Definition A. A set of statistical estimation functions $T_i$ which satisfies
conditions (1) and (2a) will be said to be a sufficient set of estimation functions
for estimating the parameters $a_i$, $(i = 1, 2, \ldots, r)$, relative to the sample $O_n$.

Definition B. A set of statistical estimation functions $T_i$ which satisfies
conditions (1) and (2b) will be said to be a functionally sufficient set of
estimation functions for estimating the parameters $a_i$, $(i = 1, 2, \ldots, r)$, relative to
the sample $O_n$.

Definition C. If the conditions (1) and (3) are satisfied, and the p.d.f. of
$(T_1, T_2, \ldots, T_r)$ is independent of the parameters $a_i$, $(i = 1, 2, \ldots, r)$, the
functions $T_i$ will be said to be stable relative to estimation of these parameters.

Definition D. If the conditions (1) and (3) are met, and $E(T_i)$, $(i = 1, 2, \ldots, r)$,
are independent of the parameters to be estimated, the functions $T_i$
will be said to be stable-in-mean; and similarly if modal or median values of $T_i$
are independent of these parameters, the estimation functions will be said
to be stable-in-mode, stable-in-median, etc.
It is not difficult to prove that a set of maximum likelihood functions

$L_\alpha = \partial[\log P(O_n; \alpha, \beta)]/\partial\alpha, \quad L_\beta = \partial[\log P(O_n; \alpha, \beta)]/\partial\beta$

under the condition that the second order determinant

$\begin{vmatrix} L_{\alpha\alpha} & L_{\alpha\beta} \\ L_{\beta\alpha} & L_{\beta\beta} \end{vmatrix}$

exists and does not vanish over the admissible range of $\alpha$ and $\beta$ constitutes a
set of estimation functions for $\alpha$ and $\beta$ that are functionally sufficient and
stable-in-mean under the definition given above. The meeting of Condition (2b)
is demonstrated by the relation

$\log P(O_n; \alpha, \beta) = \int L_\alpha(\alpha, \beta)\, d\alpha + \int L_\beta(\alpha, \beta)\, d\beta + \log P(O_n; \alpha_0, \beta_0),$

the integrals being taken along a path in $R$ from $(\alpha_0, \beta_0)$ to $(\alpha, \beta)$,
since the first two terms on the right depend entirely upon the functions $L_\alpha$
and $L_\beta$, and the third term on the right becomes independent of $\alpha$ and $\beta$, if
$\alpha_0$ and $\beta_0$ are arbitrarily chosen, once for all, in the admissible region $R$.

In general the maximum likelihood functions are not stable estimation
functions, but in many cases by the introduction of suitable factors which appear
in the variance-covariance matrix (see (5.3) and (5.4)) estimation functions
may be formed which satisfy Definition C.


4. Sufficient statistical estimation functions for the distribution of maximum
values. The probability density function for the sample $O_n(x_i)$ drawn from
a universe of maximum values is

(4.1) $P(O_n) = \alpha^n \exp\left[-\alpha \sum (x_i - u) - \sum e^{-\alpha(x_i - u)}\right]$

where the summation sign used here and hereinafter refers to summation over
all indices from 1 to $n$. Let $\bar{x}$ denote the sample mean, and define a new set
of variables $z_i$ by

(4.2) $z_i = e^{-\alpha x_i}, \quad (i = 1, 2, \ldots, n),$

with mean $\bar{z}$. Also set

$z_0 = e^{-\alpha u}.$

Recognizing that the variables $2z_i/z_0$ are independently distributed like $\chi^2$
on two degrees of freedom, the probability density function of $\bar{z}$ is given by

(4.3) $P(\bar{z})\, d\bar{z} = [1/\Gamma(n)]\, e^{-n\bar{z}/z_0} (n\bar{z}/z_0)^{n-1}\, n\, d\bar{z}/z_0$

with mean equal to $z_0$ and variance equal to $z_0^2/n$.

The mean value of $t$ of the original distribution (2.1) is known to be Euler’s
constant, which will be denoted by $C$. Thus

(4.4) $E[\alpha(\bar{x} - u)] = C = .5772157.$
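The two facts used above can be checked by simulation. The sketch below is an illustration only: it draws a sample from (2.1) by inverting the cumulative frequency $\exp(-e^{-t})$, so that $z_i/z_0 = e^{-t_i}$ behaves like a unit exponential variable (equivalently $2z_i/z_0$ like $\chi^2$ on two degrees of freedom), and compares the mean of $\alpha(\bar{x} - u)$ with Euler's constant. The seed, sample size and parameter values are arbitrary assumptions.

```python
import math
import random

EULER_C = 0.5772156649015329

def draw_sample(n, alpha, u, rng):
    """Draw n values from (2.1) by inverting the cumulative frequency exp(-exp(-t))."""
    xs = []
    while len(xs) < n:
        p = rng.random()
        if 0.0 < p < 1.0:                    # guard the logarithms
            xs.append(u - math.log(-math.log(p)) / alpha)
    return xs

rng = random.Random(1)
alpha, u, n = 0.02, 180.0, 100000
xs = draw_sample(n, alpha, u, rng)

z0 = math.exp(-alpha * u)
ratios = [math.exp(-alpha * x) / z0 for x in xs]        # z_i / z_0
mean_ratio = sum(ratios) / n                            # z-bar / z_0, near 1
var_ratio = sum((r - mean_ratio) ** 2 for r in ratios) / n   # near 1
mean_t = alpha * (sum(xs) / n - u)                      # alpha * (x-bar - u), near C
print(mean_ratio, var_ratio, mean_t)
```

A unit exponential variable has mean 1 and variance 1, which is what the first two printed values estimate; the third estimates $C$.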
The above considerations point to a set of statistical estimation functions
defined as follows:

(4.5) $X = \sqrt{n}\,[\alpha(\bar{x} - u) - C], \quad Y = \sqrt{n}\,[\bar{z}/z_0 - 1].$

The author was not able to determine the explicit bivariate probability density
function of $X$ and $Y$, but the moment generating function $G$ may be found
with some degree of facility if the variables $z_i$ are used in (4.1). Using
simplified functions $n\alpha(\bar{x} - u)$ and $n\bar{z}/z_0$,

(4.6) $G(\theta_1, \theta_2) = E[e^{\theta_1 n\alpha(\bar{x} - u)} e^{\theta_2 n\bar{z}/z_0}] = (1 - \theta_2)^{-n(1 - \theta_1)}\, \Gamma^n(1 - \theta_1).$

Clearly $\bar{x}$ and $\bar{z}$ are not statistically independent. The first and second partial
derivatives give

(4.7) $G_{\theta_1}(0, 0) = nC, \quad G_{\theta_2}(0, 0) = n, \quad G_{\theta_1\theta_1}(0, 0) = n\pi^2/6 + n^2C^2,$
$G_{\theta_2\theta_2}(0, 0) = n^2 + n, \quad G_{\theta_1\theta_2}(0, 0) = n^2C - n.$

Hence the variances of the marginal distributions are

(4.8) $\sigma^2[n\alpha(\bar{x} - u)] = n\pi^2/6, \quad \sigma^2(n\bar{z}/z_0) = n,$

and the covariance is equal to $-n$.

Now the marginal distributions rapidly approach normality with increasing
$n$. The question arises whether the bivariate distribution approaches normality.
One way to prove this is as follows: Consider the moment-generating function
$G_s$ of the statistical functions $X$ and $Y$ defined by (4.5). Following methods
outlined above, with $\theta_3 = \sqrt{n}\,\theta_1$, $\theta_4 = \sqrt{n}\,\theta_2$, it is not difficult to show that
the logarithm of the moment generating function $G_s(\theta_3, \theta_4)$ is given by

$\log G_s = (\sqrt{n}\,\theta_3 - n) \log(1 - \theta_4/\sqrt{n}) - \sqrt{n}\,\theta_4 + n \log \Gamma(1 - \theta_3/\sqrt{n}) - \sqrt{n}\, C\theta_3.$

As $n \to \infty$, one notes the relations

(4.9) $-n \log(1 - \theta_4/\sqrt{n}) - \sqrt{n}\,\theta_4 = \theta_4^2/2 + o_1(\sqrt{n}),$
$n \log \Gamma(1 - \theta_3/\sqrt{n}) - \sqrt{n}\, C\theta_3 = (\theta_3^2/2)(\pi^2/6) + o_2(\sqrt{n}),$
$\sqrt{n}\,\theta_3 \log(1 - \theta_4/\sqrt{n}) = -\theta_3\theta_4 + o_3(\sqrt{n}),$

where $o_i(\sqrt{n})$ denote functions that approach zero as $\sqrt{n} \to \infty$, uniformly
for $\theta_3$ and $\theta_4$ in the neighborhood of zero. The limit

$\lim_{n \to \infty} \log G_s = \tfrac{1}{2}[\theta_4^2 - 2\theta_3\theta_4 + (\pi^2/6)\theta_3^2]$

is recognized as the logarithm of the moment generating function of a normal
bivariate distribution.
Thus the bivariate probability distribution function of the estimation functions
$X$ and $Y$ approaches the normal bivariate distribution with zero means and
variance-covariance matrix

(4.10) $\begin{pmatrix} \pi^2/6 & -1 \\ -1 & 1 \end{pmatrix}$

as $n$ increases without limit, and the means and second order moments thus
indicated hold precisely for all values of $n$.
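The entries of (4.10) can be confirmed by direct numerical integration. Because $X$ and $Y$ are normalised sums of $n$ independent terms, the matrix equals the variance-covariance matrix of a single pair $(t, e^{-t})$, where $t = \alpha(x - u)$ has density $e^{-t - e^{-t}}$. The sketch below uses a plain trapezoidal rule; the integration limits, step count and the name `expect` are illustrative assumptions.

```python
import math

def expect(fn, lo=-12.0, hi=60.0, steps=72000):
    """Trapezoidal estimate of E[fn(t)] for t with density exp(-t - exp(-t))."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        t = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * fn(t) * math.exp(-t - math.exp(-t))
    return total * h

mean_t = expect(lambda t: t)                         # Euler's constant C
var_t = expect(lambda t: t * t) - mean_t ** 2        # pi^2 / 6
mean_y = expect(lambda t: math.exp(-t))              # y = z_i / z_0, mean 1
var_y = expect(lambda t: math.exp(-2.0 * t)) - mean_y ** 2
cov_ty = expect(lambda t: t * math.exp(-t)) - mean_t * mean_y
print([[var_t, cov_ty], [cov_ty, var_y]])
```

Multiplying these per-observation moments by $n$ reproduces the variances and covariance stated in (4.8).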

The functions $X$ and $Y$ satisfy Condition (1) for sufficiency relative to
estimation of the parameters $\alpha$ and $u$ provided $\alpha$ and $u$ can be expressed as single valued
functions of $X$ and $Y$. A condition for this is that the Jacobian of the
transformation shall not vanish. This Jacobian may be reduced to

$(n\alpha\bar{z}/z_0)\,[\bar{x} - (\sum x_i e^{-\alpha x_i})/(\sum e^{-\alpha x_i})].$

Let $x_i$ be ordered so that $x_i \leq x_{i+1}$. Then for $\alpha > 0$, the second term
constitutes a weighted mean with positive weights which monotonically decrease as $i$
increases, when the inequality $x_i < x_{i+1}$ holds. Hence unless all $x_i$ are equal,
this weighted mean is less algebraically than $\bar{x}$. Condition (2a) for sufficiency
is clearly met by these functions. Thus one concludes that for $\alpha > 0$, and the
case that not all $x_i$ are equal, the estimation functions $X$ and $Y$ constitute a sufficient
set of estimation functions for the parameters $\alpha$ and $u$ of distribution (2.1). Since
the moment generating function (see (4.6)) is independent of $\alpha$ and $u$, these
functions are also stable estimation functions.


5. Maximum likelihood estimation functions. General theory points to
the use of the method of maximum likelihood as giving the most efficient solution
(see [5]). With

(5.1) $f(x) = \alpha e^{-\alpha(x - u) - e^{-\alpha(x - u)}}$

the maximum likelihood estimation functions are

(5.2) $L_u = -n\alpha(\bar{z}/z_0 - 1),$
$L_\alpha = n[1/\alpha - (\bar{x} - u) - \partial(\bar{z}/z_0)/\partial\alpha]$
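One way to solve $L_u = 0$, $L_\alpha = 0$ numerically is sketched below. This is not the author's trial-and-error procedure but a standard reduction, stated here as an assumption: $L_u = 0$ gives $\bar{z} = z_0$, that is $u = -(\log \bar{z})/\alpha$, and substituting into $L_\alpha = 0$ leaves the single equation $1/\alpha - \bar{x} + (\sum x_i e^{-\alpha x_i})/(\sum e^{-\alpha x_i}) = 0$, whose left side decreases in $\alpha$ and can be solved by bisection. The function name `ml_estimates` and the simulated data are hypothetical.

```python
import math
import random

def ml_estimates(xs):
    """Solve L_u = 0, L_alpha = 0 for the density alpha*exp(-t - exp(-t)), t = alpha*(x - u)."""
    n = len(xs)
    xbar = sum(xs) / n
    xmin = min(xs)

    def h(alpha):
        # 1/alpha - xbar + weighted mean of x with weights exp(-alpha * x);
        # shifting by xmin keeps the exponentials in range.
        w = [math.exp(-alpha * (x - xmin)) for x in xs]
        return 1.0 / alpha - xbar + sum(wi * x for wi, x in zip(w, xs)) / sum(w)

    lo, hi = 1.0, 1.0
    while h(lo) <= 0:          # move left until h is positive
        lo *= 0.5
    while h(hi) >= 0:          # move right until h is negative
        hi *= 2.0
    for _ in range(200):       # bisection on the decreasing function h
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    zbar_shifted = sum(math.exp(-alpha * (x - xmin)) for x in xs) / n
    u = xmin - math.log(zbar_shifted) / alpha      # from z-bar / z_0 = 1
    return u, alpha

# Simulated data with known parameters (illustrative values only).
rng = random.Random(2)
true_u, true_alpha = 180.0, 0.02
xs = [true_u - math.log(rng.expovariate(1.0)) / true_alpha for _ in range(2000)]
u_hat, alpha_hat = ml_estimates(xs)
print(u_hat, alpha_hat)
```

The bisection requires that not all sample values be equal, the same condition under which the Jacobian discussed in section 4 does not vanish.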


with variance-covariance matrix

(5.3) $\begin{pmatrix} n\alpha^2 & n(1 - C) \\ n(1 - C) & (n/\alpha^2)[\pi^2/6 + (1 - C)^2] \end{pmatrix}.$

Thus with

(5.4) $X = \sqrt{n}\,[\bar{z}/z_0 - 1], \quad Y = \sqrt{n}\,[\alpha(u - \bar{z}e^{\alpha u}(u + z_\alpha/\bar{z})) - (\alpha\bar{x} - 1)]/B,$

$B = \sqrt{\pi^2/6 + (1 - C)^2},$

where

$z_\alpha = \partial[\sum e^{-\alpha x_i}/n]/\partial\alpha,$

the bivariate distribution of $X$ and $Y$ rapidly approaches normality with
increasing $n$, with zero means, unit variances, and correlation coefficient given
by (negative, since sign of $L_u$ has been reversed)

(5.5) $r = -(1 - C)/\sqrt{\pi^2/6 + (1 - C)^2}.$

With non-vanishing Jacobian, $X$ and $Y$ constitute a sufficient set of estimation
functions for the parameters $\alpha$ and $u$ (see (3.2) above). Furthermore the unit
variances and correlation value given above are exact for all values of $n$. By setting
up the moment generating function it is not difficult to show that these functions
are also stable estimation functions for all values of $n$.

The theory of maximum likelihood further shows that if $\hat{u}$ and $\hat{\alpha}$ are defined
as the $u$ and $\alpha$ solutions of the equations

(5.6) $L_u = 0, \quad L_\alpha = 0$

the distribution of $\sqrt{n}\,(\hat{u} - u)$ and $\sqrt{n}\,(\hat{\alpha} - \alpha)$ will approach normality
asymptotically with zero means and variance-covariance matrix which is the reciprocal
of the above matrix (multiplied by $n$); namely,

(5.7) $\begin{pmatrix} (1/\alpha^2)[1 + (1 - C)^2/(\pi^2/6)] & -(1 - C)/(\pi^2/6) \\ -(1 - C)/(\pi^2/6) & \alpha^2/(\pi^2/6) \end{pmatrix}.$

6. Numerical applications. As an illustration of the application of the 
methods outlined above for determining the parameters of the distribution of 
maximum values from an observed sample, data is taken from the 57 year 
record of annual maximum flood flows previously used as an illustration by the 
author ([6] p. 324). There is some evidence to indicate that such a series 
follows approximately the distribution of maximum values. At any rate the 
series serves pretty well as a numerical illustration. 

Confidence regions for $u$ and $\alpha$ can be determined by four methods based
upon the preceding theory. In order to make the numerical illustration more
cogent, we shall answer the following question by each of the methods. What 
is the confidence interval (with confidence level .95) for annual flood x correspond- 
ing to a cumulated frequency of .99 (often referred to as a 100 yr. flood) based 
upon our observed 57 yr. sample, under the assumption that the distribution 
of maximum values (2.1) applies to this data? 

Method 1. (Based on estimation functions of section 4.) In this case the
statistical estimation functions $X_1$ and $Y_1$ defined from (4.5) by $X_1 = X\sqrt{6}/\pi$,
$Y_1 = Y$, are used. The “best values” of $u$ and $\alpha$ are taken as the solutions
of $X_1 = 0$, $Y_1 = 0$, found by trial and error. As a starting point values of $u$
and $\alpha$ may be estimated from $X_1 = 0$ and the standard deviation of $x_i$ (see
[2] or [6]), the mean deviation of $x_i$, or an adjusted modal value (see [3]). A
few trials give

$\hat{u} = 179.7, \quad \hat{\alpha} = .01998.$

Approximating the distribution function of $X_1$ and $Y_1$ by the limiting normal
bivariate distribution (4.10), with confidence level of .95 the equation of the
bounding constant probability ellipse is found to be

(6.1) $X_1^2 + (1.5594)X_1Y_1 + Y_1^2 = 2.3491$

where the constants are independent of the sample values. This ellipse, by
virtue of the one-to-one correspondence between $(X_1, Y_1)$ and $(u, \alpha)$, bounds
$u$ and $\alpha$ based upon the observed sample (see [4]).
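The constants in (6.1) follow from the correlation implied by (4.10). A minimal sketch, assuming the standard result that for a bivariate normal pair with unit variances and correlation $\rho$ the quantity $(X^2 - 2\rho XY + Y^2)/(1 - \rho^2)$ is distributed like $\chi^2$ on two degrees of freedom; the function name `ellipse_constants` is hypothetical.

```python
import math

EULER_C = 0.5772156649015329

def ellipse_constants(rho, level=0.95):
    """Coefficients (c, k) of X^2 + c*X*Y + Y^2 = k for the constant-probability
    ellipse of a bivariate normal with unit variances and correlation rho."""
    chi2 = -2.0 * math.log(1.0 - level)     # chi-square quantile, 2 degrees of freedom
    return -2.0 * rho, (1.0 - rho * rho) * chi2

# Correlation of X and Y from the matrix (4.10): -1 / sqrt(pi^2 / 6).
rho_1 = -math.sqrt(6.0) / math.pi
c1, k1 = ellipse_constants(rho_1)

# Correlation (5.5) of the maximum likelihood estimation functions.
rho_2 = -(1.0 - EULER_C) / math.sqrt(math.pi ** 2 / 6.0 + (1.0 - EULER_C) ** 2)
c2, k2 = ellipse_constants(rho_2)
print(c1, k1, c2, k2)
```

The first pair reproduces the constants 1.5594 and 2.3491 of (6.1); the second pair gives the constants that appear for the maximum likelihood functions in (6.4).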

For cumulated frequency .99, the distribution of maximum values (2.1)
yields

$t = \alpha(x - u) = 4.60015.$

Thus the analytic problem is that of determining the maximum and minimum
value of

(6.2) $x = g(u, \alpha) = 4.60015/\alpha + u$

which occurs on the ellipse (6.1).¹

The writer solved this graphically. It was found necessary to compute
three values of $\bar{z}$, at $\alpha = .01$, $.015$ and $.025$, in addition to the value of $\bar{z}$ at
$\hat{\alpha} = .01998$ previously found. From these computations the curves $\alpha = .01$,
$\alpha = .015$, $\alpha = .01998$ and $\alpha = .025$ were drawn on the chart of the ellipse (6.1).
The $u$ = const. curves were quite easily determined by points on the $\alpha$ = const.
curves found from their $X_1$ coordinates which are linear functions of $u$ (see (4.5)).
The extreme values of $g(u, \alpha)$ will be found to occur near the extreme values of $\alpha$
on the ellipse. A construction of several $u$ = const. curves near these extremes
enables one to determine several successive values of $g(u, \alpha)$ at points where
these curves cross the ellipse. The answers were

(6.3) Max. $g(u, \alpha) = 507.4$ at $u = 192$, $\alpha = .01459$,
Min. $g(u, \alpha) = 360.0$ at $u = 172$, $\alpha = .02447$,
and $g(\hat{u}, \hat{\alpha}) = 409.9$.

¹ Since with non-vanishing Jacobian of $(X_1, Y_1)$ relative to $(u, \alpha)$, no singular point of
the $(u, \alpha)$ coordinate system can lie within the ellipse, it is clear from the form of the
function $g(u, \alpha)$ that its maximum and minimum values will lie on the boundary of the ellipse.
A similar remark applies to Methods 2-4.

Method 2. (Based on maximum likelihood statistical estimation functions
(5.4).) For purposes of comparison the writer carried through the solution
using the maximum likelihood estimation functions $X_2$ and $Y_2$ defined by (5.4).
In this case the equation of the bounding ellipse was

(6.4) $X_2^2 + (.62614)X_2Y_2 + Y_2^2 = 5.4042.$

The determination of the network of $\alpha$ = const., $u$ = const. curves was much
more complicated in this case. The results were

(6.5) Solution of $X_2 = 0$, $Y_2 = 0$ gave $\hat{u} = 180.6$, $\hat{\alpha} = .01924$; $g(\hat{u}, \hat{\alpha}) = 419.7$,
Max. $g(u, \alpha) = 509.5$ at $u = 187$, $\alpha = .01426$,
Min. $g(u, \alpha) = 364.4$ at $u = 172$, $\alpha = .02391$.

The slightly smaller range of estimate of $g(u, \alpha)$ resulting from the use of
the second method was forecast from the general theory which predicts a
narrowing of range of variation of $u$ and $\alpha$ for the same confidence level. Both bivariate
distributions involve exact moments of the first and second degree for finite $n$,
and both approach normality rapidly with increasing $n$. Hence comparable
results were to be expected. Of course the form of the function $g(u, \alpha)$ in relation
to the different types of estimation functions used in the two cases might modify
the comparability of the results.

Method 3. (Based on limiting distribution of maximum likelihood statistics
$\hat{u}$ and $\hat{\alpha}$, with variances unknown.) The use of the limiting distribution of
the estimation functions $\sqrt{n}\,(\hat{u} - u)$, $\sqrt{n}\,(\hat{\alpha} - \alpha)$ led to results which were
not entirely expected by the author. Taking

(6.6) $X_3 = A\alpha(\hat{u} - u)/B, \quad Y_3 = A(\hat{\alpha}/\alpha - 1),$
$A = \pi\sqrt{n}/\sqrt{6}, \quad B = \sqrt{\pi^2/6 + (1 - C)^2},$

with

$r = -(1 - C)/B,$

the equation of the bounding ellipse is the same as (6.4), (no reversal of sign of
$r$ occurs because sign of $r$ in (6.4) was reversed by reversing sign of $L_u$ in (5.4)).

Using the inverse method where the range in $u$ and $\alpha$, with $\hat{u} = 180.6$,
$\hat{\alpha} = .01924$, is determined from the range of $(X_3, Y_3)$ within the ellipse (6.4), the
maximum and minimum obtained for $g(u, \alpha)$ were

(6.7) Max. $g(u, \alpha) = 490.2$ at $u = 193.2$, $\alpha = .01549$,
Min. $g(u, \alpha) = 353.8$ at $u = 174.0$, $\alpha = .02558$.

This result does not agree closely with the previous results. The reason for
this discrepancy may be that since the variances indicated by (5.7) are not
exact for finite $n$, a variation of $\alpha$ from the central value predicted by (5.6) tends
to exaggerate the departure of the distribution of $X_3$ and $Y_3$ from the limiting
normal distribution through its effect upon the variances. The plausibility
of such an explanation is strengthened by the numerical results of a solution
of our problem by Method 4.
Method 4. (Based on limiting distribution of maximum likelihood statistics
$\hat{u}$ and $\hat{\alpha}$, with variances estimated by taking $\alpha = \hat{\alpha}$ as observed from the sample.)
In this case the unknown variances are estimated by taking $\alpha = \hat{\alpha}$ as observed
from the sample studied. In order to avoid confusion let $\alpha_0$ denote this value
of $\hat{\alpha}$ as used in the variance formulae. Thus the estimating functions $X_4$ and
$Y_4$ become

(6.8) $X_4 = A\alpha_0(\hat{u} - u)/B, \quad Y_4 = A(\hat{\alpha} - \alpha)/\alpha_0$

and the approximating distribution of $(X_4, Y_4)$ is taken as the same limiting
normal distribution used in Method 3. With

$u_0 = \hat{u} = 180.6, \quad \alpha_0 = \hat{\alpha} = .01924$

the extreme values of $g(u, \alpha)$ on the ellipse were

(6.9) Max. $g(u, \alpha) = 507.4$ at $u = 188.6$, $\alpha = .01443$,
Min. $g(u, \alpha) = 362.8$ at $u = 169.7$, $\alpha = .02382$.

These results agree closely with the results obtained by Methods 1 and 2.

The confidence intervals in $g(u, \alpha)$ obtained were, in summary:

Method 1 360.0 to 507.4 

Method 2 364.4 to 509.6 

Method 3 353.8 to 490.2 

Method 4 362.8 to 507.4. 
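The tabulated limits can be recomputed from (6.2) using the parameter pairs reported for each method. The sketch below assumes the cumulative frequency $\exp(-e^{-t})$ for the distribution (2.1), so that the reduced value for frequency .99 is $t = -\log(-\log .99)$; because $u$ and $\alpha$ are reported to limited precision, agreement is only to within a few tenths. The function names are hypothetical.

```python
import math

def t_for_frequency(freq):
    """Reduced value t at which the cumulative frequency exp(-exp(-t)) equals freq."""
    return -math.log(-math.log(freq))

def g(u, alpha, freq=0.99):
    """Flood x = t/alpha + u of equation (6.2)."""
    return t_for_frequency(freq) / alpha + u

# (u, alpha) pairs reported for the minimum and maximum under each method.
reported = {
    "Method 1": [(172.0, 0.02447), (192.0, 0.01459)],
    "Method 2": [(172.0, 0.02391), (187.0, 0.01426)],
    "Method 3": [(174.0, 0.02558), (193.2, 0.01549)],
    "Method 4": [(169.7, 0.02382), (188.6, 0.01443)],
}
for method, (low, high) in reported.items():
    print(method, round(g(*low), 1), round(g(*high), 1))
```

The same function evaluated at the central estimates, for example `g(180.6, 0.01924)`, gives the point estimate of the flood with cumulative frequency .99.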

From the analysis of the four methods presented above, one might
recommend the following two procedures for finding the confidence interval for $x$
in a problem of the above description, as practicable:

Procedure 1. Use Method 1.

Procedure 2. Determine the maximum likelihood estimates $\hat{u}$ and $\hat{\alpha}$ from
(5.6) by trial and error. Then use Method 4. Presumably the second procedure
would be more open to question, especially for small values of $n$.
ON FUNCTIONS OF SEQUENCES OF INDEPENDENT CHANCE VECTORS
WITH APPLICATIONS TO THE PROBLEM OF THE
“RANDOM WALK” IN k DIMENSIONS

By D. Blackwell and M. A. Girshick

Howard University and U. S. Department of Agriculture

1. Summary. Consider a sequence $\{X_i\}$ of independent chance vectors in $k$
dimensions with identical distributions, and a sequence of mutually exclusive
events $S_1, S_2, \ldots$, such that $S_i$ depends only on the first $i$ vectors and $\sum P(S_i) = 1$.
Let $\varphi_i$ be a real or complex function of the first $i$ vectors in the sequence
satisfying conditions: (1) $E(\varphi_i) = 0$ and (2) $E(\varphi_j \mid X_1, \ldots, X_i) = \varphi_i$ for $j > i$.
Let $\varphi = \varphi_i$ and $n = i$ when $S_i$ occurs. A general theorem is proved which gives
the conditions $\varphi_i$ must satisfy such that $E\varphi = 0$. This theorem generalizes
some of the important results obtained by Wald for $k = 1$. A method is also
given for obtaining the distribution of $\varphi$ and $n$ in the problem of the “random
walk” in $k$ dimensions for the case in which the components of the vector take
on a finite number of integral values.

2. A basic theorem.

2.1 Let $\{X_i\} = \{(X_{1i}, X_{2i}, \ldots, X_{ki})\}$ be a sequence of independent
$k$-dimensional chance variables with identical distributions. Let $S_1, S_2, S_3, \ldots$ be
mutually exclusive events such that (1) $S_i$ depends only on $X_1, X_2, \ldots, X_i$,
and (2) $\sum P(S_i) = 1$. Let $\varphi_i(X_1, X_2, \ldots, X_i)$ be a sequence of real or complex
variables satisfying the following two conditions:

Condition 1: $E(\varphi_i) = 0$ for all $i$.

Condition 2: $E(\varphi_j \mid X_1, X_2, \ldots, X_i) = \varphi_i$ for all $j > i$, where
$E(\varphi_j \mid X_1, X_2, \ldots, X_i)$ stands for the expected value of $\varphi_j$ under the condition that
$X_1, X_2, \ldots, X_i$ are held constant.¹ Define $\varphi = \varphi_i$ and $n = i$ when the event $S_i$ occurs.
We shall assume that $E(n)$ is finite.

¹ Chance variables $\varphi_i$ satisfying condition 2 have been extensively studied by P. Lévy
[1] and J. L. Doob [2].

A problem of central importance in sequential theory may be formulated as
follows: What conditions must $\varphi_i$ satisfy so that $E(\varphi)$ exists and equals zero?
We shall prove the following:

Theorem 2.1. If there exists a function $f(x_1, x_2, \ldots, x_k) \geq 0$ such that (a)
$E[f(X_i)]$ is finite and (b) $|\varphi_i| \leq \sum_{j=1}^{i} f(X_j)$ when $n \geq i$, then $E(\varphi)$ exists and
equals zero.

Before proceeding to the proof, we consider two consequences of this theorem.

I. Assume that $E(X_{ri}) = a_r$. Let $\varphi_i = \sum_{j=1}^{i} (X_{rj} - a_r)$. It is easily
verified that $\varphi_i$ satisfies conditions 1 and 2. We set $f(x_1, x_2, \ldots, x_k) = |x_r - a_r|$.
Then Theorem 2.1 is applicable and we get $E\varphi = 0$. Now $\varphi = W_r - na_r$,
where $W_r = \sum_{i=1}^{n} X_{ri}$. Hence we have

(2.11) $E(W_r) = a_r E(n).$

The relationship (2.11) has been proved for $k = 1$ by Wald [3] and
subsequently, under somewhat more generalized conditions, by one of the authors [4].
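Relation (2.11) can be illustrated by simulation. The sketch below is an assumed example, not taken from the paper: a walk in $k = 2$ dimensions whose first component steps $+1$ with probability 0.6 and $-1$ otherwise ($a_1 = 0.2$), whose second component is uniform on $\{0, 1, 2\}$ ($a_2 = 1$), and which stops when the first component's sum first reaches $\pm 5$. The stopping event depends only on the vectors observed so far, as the theorem requires.

```python
import random

def run_walk(rng, bound=5):
    """One 2-dimensional walk; stops when |W_1| first reaches `bound`.

    Component 1 steps +1 with probability 0.6 and -1 otherwise (a_1 = 0.2);
    component 2 is uniform on {0, 1, 2} (a_2 = 1).
    """
    w1 = w2 = n = 0
    while abs(w1) < bound:
        w1 += 1 if rng.random() < 0.6 else -1
        w2 += rng.randrange(3)
        n += 1
    return w1, w2, n

rng = random.Random(3)
trials = 20000
sum_w1 = sum_w2 = sum_n = 0
for _ in range(trials):
    w1, w2, n = run_walk(rng)
    sum_w1 += w1
    sum_w2 += w2
    sum_n += n
mean_w1, mean_w2, mean_n = sum_w1 / trials, sum_w2 / trials, sum_n / trials
print(mean_w1, 0.2 * mean_n)   # both estimate E(W_1) = a_1 E(n)
print(mean_w2, 1.0 * mean_n)   # both estimate E(W_2) = a_2 E(n)
```

Note that the second component plays no part in the stopping rule, yet (2.11) holds for it as well, since the relation is stated for each component $r$ separately.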

II. Let $t_1, t_2, \ldots, t_k$ be any real or complex numbers for which
$E e^{\sum_{r=1}^{k} t_r X_{ri}} = a$ is finite and $|a| \geq 1$. We assume that there exists a positive constant $M$
such that

(2.12) $\left|\sum_{i=1}^{m} X_{ri}\right| < M, \quad r = 1, 2, \ldots, k,$

when $n > m$. Let

(2.13) $\varphi_m = a^{-m} e^{\sum_{r=1}^{k} t_r \sum_{i=1}^{m} X_{ri}} - 1$

so that

(2.14) $\varphi = a^{-n} e^{\sum_{r=1}^{k} t_r W_r} - 1$

where $W_r$ is defined as above. It is easy to show that $\varphi_m$ satisfies conditions
1 and 2. Now, in view of (2.12), when $n \geq m$

(2.15) $|\varphi_m| \leq |a|^{-m} e^{\sum_{r=1}^{k} \tau_r \sum_{i=1}^{m} X_{ri}} + 1 \leq 1 + R e^{\sum_{r=1}^{k} \tau_r X_{rm}}$

where $\tau_r$ is the real part of $t_r$ and $R = e^{M \sum_{r=1}^{k} |\tau_r|}$ is a fixed positive constant.
Then, letting

(2.16) $f(x_1, x_2, \ldots, x_k) = 1 + R e^{\sum_{r=1}^{k} \tau_r x_r}$

we may apply Theorem 2.1 and obtain

(2.17) $E(a^{-n} e^{\sum_{r=1}^{k} t_r W_r}) = 1$

which is a generalization of the Fundamental Identity proved by Wald [5]
for the case $k = 1$.
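For $k = 1$ and a simple walk the identity (2.17) can be verified exactly rather than by simulation. The sketch below is an assumed example: steps of $+1$ with probability 0.6 and $-1$ otherwise, stopping when the sum reaches $+5$ or $-5$, and a real value $t = 0.3$. The joint distribution of $(n, W)$ is built by propagating probabilities step by step. The variable `phi` stands for the quantity the text calls $a$, renamed here only to avoid a clash with the boundary argument `a`; all names are hypothetical.

```python
import math

def stopped_walk(p_up, a, b, tol=1e-15):
    """Exact joint distribution of (n, W) for a walk on the integers from 0,
    stepping +1 with probability p_up and -1 otherwise, stopped when W
    reaches a or -b. Returns a list of (n, w, probability)."""
    alive = {0: 1.0}           # probability mass at each not-yet-stopped position
    stopped = []
    n = 0
    while sum(alive.values()) > tol:
        n += 1
        nxt = {}
        for pos, pr in alive.items():
            for step, ps in ((1, p_up), (-1, 1.0 - p_up)):
                w = pos + step
                if w >= a or w <= -b:
                    stopped.append((n, w, pr * ps))
                else:
                    nxt[w] = nxt.get(w, 0.0) + pr * ps
        alive = nxt
    return stopped

p_up, t = 0.6, 0.3
phi = p_up * math.exp(t) + (1.0 - p_up) * math.exp(-t)   # E exp(t X), here > 1
dist = stopped_walk(p_up, a=5, b=5)
identity = sum(pr * phi ** (-n) * math.exp(t * w) for n, w, pr in dist)
print(phi, identity)
```

The value of `identity` differs from 1 only by the probability mass left unstopped at truncation and by rounding error.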

Proof of Theorem 2.1. Assume $\varphi_i$ is real. Define chance variables $N_m$ inductively as follows: $N_0 = 0$. Assuming $N_0, \cdots, N_m$ defined, define $N_{m+1} = N_m + n(X_{N_m+1}, X_{N_m+2}, \cdots)$. Also let $n_m = N_m - N_{m-1}$ and $y_m = f(X_{N_{m-1}+1}) + \cdots + f(X_{N_m})$. It can be shown by induction that $N_m$ is defined for all m with probability one, and that $\{n_m\}$, $\{y_m\}$ are sequences of independent chance variables with identical distributions. Clearly $n_1 = n$.

The Strong Law of Large Numbers asserts that if $z_1, z_2, \cdots$ are independent chance variables with identical distribution, then $\lim_{m \to \infty} \frac{z_1 + \cdots + z_m}{m} = c$ with probability one if and only if $E z_1$ exists and equals c.

It follows that, with probability one

(2.18) $\lim_{m \to \infty} \frac{f(X_1) + \cdots + f(X_m)}{m} = E[f(X_1)]$

and

(2.19) $\lim_{m \to \infty} \frac{n_1 + \cdots + n_m}{m} = \lim_{m \to \infty} \frac{N_m}{m} = E(n).$

Since $\frac{y_1 + \cdots + y_m}{n_1 + \cdots + n_m} = \frac{f(X_1) + \cdots + f(X_{N_m})}{N_m}$ is a subsequence of $\frac{f(X_1) + \cdots + f(X_m)}{m}$, we have with probability one,

(2.20) $\lim_{m \to \infty} \frac{y_1 + \cdots + y_m}{N_m} = E[f(X_1)]$

so that

(2.21) $\lim_{m \to \infty} \frac{y_1 + \cdots + y_m}{m} = E[f(X_1)] E(n).$

Consequently, $E(y_1)$ exists and equals $E f(X_1) E(n)$. Since $|\varphi| \leq y_1$, $E(\varphi)$ exists. Also using conditions (2) and (b) which were imposed on $\varphi$, we have


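(2.22) $\left| \int_{n \leq i} \varphi \, dp \right| = \left| \sum_{j=1}^{i} \int_{n=j} \varphi_j \, dp \right| = \left| \sum_{j=1}^{i} \int_{n=j} \varphi_i \, dp \right| = \left| - \int_{n>i} \varphi_i \, dp \right| \leq \int_{n>i} |\varphi_i| \, dp \leq \int_{n>i} y_1 \, dp$

which approaches zero as $i \to \infty$. This completes the proof.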

If $\varphi_j$ is a complex valued function, Theorem 2.1 still holds. For writing $\varphi_j = g_j + i h_j$, Condition 2 becomes $E(g_p + i h_p \mid X_1, \cdots, X_j) = g_j + i h_j$ when p > j. Hence

(2.23) $E(g_p \mid X_1, \cdots, X_j) = g_j$

and

(2.24) $E(h_p \mid X_1, \cdots, X_j) = h_j$

when p > j. Since $|g_j| \leq |\varphi_j|$ and $|h_j| \leq |\varphi_j|$ and $\varphi_j$ satisfies condition (b) we may apply Theorem 2.1 and get

(2.25) $E g = E(h) = 0.$

Hence $E\varphi = 0$.

3. Applications to the problem of the random walk in k dimensions.²

3.1. A theorem concerning decision points. Let $\{X_i\} = \{(X_{1i}, \cdots, X_{ki})\}$ be a sequence of k-dimensional chance vectors with identical distributions. We assume that the $X_{ji}$ (j = 1, 2, ⋯, k) take on a finite number of integral values ranging from $-r_j$ to $m_j$ inclusive, where $r_j$ and $m_j$ are positive integers. We remark that any distribution can be approximated to any degree of accuracy by the distribution of a variate whose values are integral multiples of a constant d, which can be taken as the unit of measurement.

Let $P_{u_1 u_2 \cdots u_k}$ be the probability that $X_1 = (u_1, u_2, \cdots, u_k)$. We define $W_{jt} = \sum_{i=1}^{t} X_{ji}$ and set $U_t = (W_{1t}, \cdots, W_{kt})$. Then $\{U_t\}$ represents a sequence of points with integral coordinates in a k-dimensional space $S_k = \{(y_1, y_2, \cdots, y_k)\}$. Let R be an arbitrary bounded region in $S_k$. We shall assume, without loss of generality, that the origin is an interior point of R. We now define a random variable n as the smallest subscript t of the sequence $\{U_t\}$ for which $U_t$ is either a boundary point or an exterior point of R. We set $U_n = W = (W_1, W_2, \cdots, W_k)$ and designate W as a decision point of R. Clearly, the number of decision points is finite.

The random variables n and W can be interpreted as follows: Consider a point Q which at the time t = 0 is at the origin. At successive intervals of time t = 1, 2, ⋯ the point Q moves with integral components in $S_k$, the direction and distance of the motion being determined by chance. The point comes to rest as soon as, but not before, it either reaches the boundary of R or falls outside of R. Let $U_t$ be the coordinates of the point Q at time t. Then n represents the length of time it takes Q to come to rest, and W represents a possible resting point.³

We shall be concerned with the problem of finding the probability distribution of n and W. These will obviously depend on the shape of the region R. In what follows we shall restrict ourselves to the class of regions R which have the property that the intersection of any line parallel to the axes with R is an open interval. In view of the fact that W has integral coordinates, we can, without any loss of generality, replace this class of regions by an equivalent class which are bounded by simple polygonal closed surfaces whose vertices have integral coordinates and whose sides are parallel to the planes $y_j = 0$. In the subsequent discussion we assume that the regions R are of this type.

Let

(3.10) $a_j = \operatorname*{l.u.b.}_{(y_1, y_2, \cdots, y_k) \in R} [y_j]$

and

(3.11) $-b_j = \operatorname*{g.l.b.}_{(y_1, y_2, \cdots, y_k) \in R} [y_j];$

then $a_j$ and $b_j$ are positive integers.

2 What follows is a generalization of a method previously employed by one of the authors [6] for the case k = 1.

3 That Q will reach a resting point eventually can be asserted with probability one. See A. Wald [6], Lemma 1.

We now prove the following:

Lemma 3.1. For the given sequence of chance vectors $\{X_i\}$ and the given region R, the number of possible decision points $N_R$ is given by

(3.12) $N_R = \prod_{j=1}^{k} (a_j + b_j + r_j + m_j - 1) - \prod_{j=1}^{k} (a_j + b_j - 1).$

Proof: We shall first prove this theorem for a rectangular region $R = R_1$, where $R_1$ is defined by $-b_j < y_j < a_j$ (j = 1, 2, ⋯, k), and then generalize the proof to any region of the class specified.

Let $R_2$ be a closed rectangular region defined by $-(b_j + r_j - 1) \leq y_j \leq (a_j + m_j - 1)$. Then $R_2 \supset R_1$. Let $S = R_2 - R_1$. It is clear that every integral point of S is a possible decision point. Moreover, no point exterior to $R_2$ is a possible decision point. For assume, for example, that there exists a decision point $W = (W_1, W_2, \cdots, W_k)$ which is an exterior point of $R_2$. Then at least one of its coordinates, say $W_j$, has the property that $W_j > a_j + m_j - 1$ or $W_j < -(b_j + r_j - 1)$. But since $-(b_j - 1) \leq W_{j,n-1} \leq a_j - 1$, it must follow that $X_{jn}$ took on a value greater than $m_j$ or less than $-r_j$, which is contrary to assumption. Now the total number of integral points contained in $R_2$ is $\prod_{j=1}^{k} (a_j + b_j + r_j + m_j - 1)$ and the total number of integral points in $R_1$, which by assumption are not decision points, is $\prod_{j=1}^{k} (a_j + b_j - 1)$. Hence the Lemma is proved if R is a rectangular region.
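The count in (3.12) for a rectangular region can be checked by brute force. The sketch below is only an illustration, not part of the authors' argument: it assumes that every step vector with components between $-r_j$ and $m_j$ has positive probability, and the function names are hypothetical.

```python
from itertools import product
from math import prod


def count_decision_points(a, b, r, m):
    """Enumerate the decision points of the rectangle -b_j < y_j < a_j.

    Assumption (not from the paper): every step (u_1, ..., u_k) with
    -r_j <= u_j <= m_j has positive probability.
    """
    k = len(a)
    steps = list(product(*[range(-r[j], m[j] + 1) for j in range(k)]))

    def interior(y):
        return all(-b[j] < y[j] < a[j] for j in range(k))

    origin = (0,) * k
    seen = {origin}          # interior points reachable before stopping
    frontier = [origin]
    decisions = set()        # first points on or outside the boundary
    while frontier:
        y = frontier.pop()
        for u in steps:
            w = tuple(y[j] + u[j] for j in range(k))
            if interior(w):
                if w not in seen:
                    seen.add(w)
                    frontier.append(w)
            else:
                decisions.add(w)
    return len(decisions)


def lemma_count(a, b, r, m):
    """Right-hand side of (3.12)."""
    k = len(a)
    outer = prod(a[j] + b[j] + r[j] + m[j] - 1 for j in range(k))
    inner = prod(a[j] + b[j] - 1 for j in range(k))
    return outer - inner
```

For example, with k = 2, a = (2, 3), b = (1, 2), r = (1, 2), m = (2, 1), both functions give 35 - 8 = 27.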

Now, let R be any polygonal region of the type specified and let $R_1$ be the corresponding rectangular region. Consider two randomly moving points Q and $Q_1$, each having coordinates $U_t$ at time t. Let the decision points for Q be defined in terms of R and the decision points of $Q_1$ in terms of $R_1$. We shall prove that the number of decision points for Q and $Q_1$ are the same.

By assumption, every line parallel to the axes intersects R in an open interval. Moreover $R_1 \supseteq R$. Hence the sum of the areas of the segments which compose the boundary of R must equal the area of the boundary of $R_1$. The same must be true for the total number of integral points on the boundaries of the two regions. Thus, the theorem is true for $r_j = m_j = 1$ (j = 1, 2, ⋯, k). We assume that the theorem is true for $r_j = r_j'$ and $m_j = m_j'$ and prove that it must hold for $m_u = m_u' + 1$ for a fixed but arbitrary u. Now it is obvious that if the range of $X_{ut}$ is increased by unity in the positive direction, the point Q can move an extra unit in the positive direction parallel to the $y_u$ axis. Thus, the total number of additional decision points that Q gains by the unit increase in the range of $X_{ut}$ is identical with the total number that $Q_1$ gains. This proves the theorem.

It is clear that the smallest rectangular region which includes all the decision points W is $R_2$. We now prove the following:

Theorem 3.1. For any polygonal region R of the class previously specified, and for any random sequence $\{X_t\}$ in which $X_t$ takes on a finite number of integral values, the number of points in the rectangular region $R_2$ which are not decision points is always equal to $\prod_{j=1}^{k} (a_j + b_j - 1)$, where $a_j + b_j$ are the dimensions of the rectangular region $R_1$.

Proof: This Theorem follows from Lemma 3.1 and the fact that the total number of integral points in $R_2$ is $\prod_{j=1}^{k} (a_j + b_j + r_j + m_j - 1)$.

3.2. The distribution of W. Let $\psi(t_1, \cdots, t_k)$ be the joint generating function of $X_{ui}$ (u = 1, 2, ⋯, k), and $\varphi(t_1, \cdots, t_k)$ the joint generating function of $W_j$ (j = 1, 2, ⋯, k). Then

(3.21) $\psi(t_1, \cdots, t_k) = \sum_{u_1=-r_1}^{m_1} \cdots \sum_{u_k=-r_k}^{m_k} P_{u_1 \cdots u_k} \, t_1^{u_1} \cdots t_k^{u_k}$

(3.22) $\varphi(t_1, \cdots, t_k) = \sum_{v_1=-(b_1+r_1-1)}^{a_1+m_1-1} \cdots \sum_{v_k=-(b_k+r_k-1)}^{a_k+m_k-1} \xi_{v_1 \cdots v_k} \, t_1^{v_1} \cdots t_k^{v_k}$

where $\xi_{v_1 \cdots v_k}$ is the probability that $W = (v_1, \cdots, v_k)$. In terms of the generating function $\psi$ the Fundamental Identity (2.17) states that

(3.23) $E \, t_1^{W_1} \cdots t_k^{W_k} [\psi(t_1, \cdots, t_k)]^{-n} = 1$

for all $t_1, \cdots, t_k$ for which $|\psi(t_1, \cdots, t_k)| \geq 1$. Hence, it follows that for $t_1, \cdots, t_k$ for which $\psi(t_1, \cdots, t_k) = 1$, $\varphi(t_1, \cdots, t_k) = 1$. Let

(3.24) $f(t_1, \cdots, t_k) = t_1^{r_1} \cdots t_k^{r_k} [\psi(t_1, \cdots, t_k) - 1]$

and

(3.25) $g(t_1, \cdots, t_k) = t_1^{b_1+r_1-1} \cdots t_k^{b_k+r_k-1} [\varphi(t_1, \cdots, t_k) - 1].$

Then $f(t_1, \cdots, t_k)$ is a polynomial of degree $r_j + m_j$ in $t_j$ and $g(t_1, \cdots, t_k)$ is a polynomial of degree $(a_j + b_j + r_j + m_j - 2)$ in $t_j$.

We shall assume that $f(t_1, \cdots, t_k)$ is an irreducible polynomial. Then, since $g(t_1, \cdots, t_k)$ vanishes for all values of $t_1, \cdots, t_k$ for which $f(t_1, \cdots, t_k)$ vanishes, it follows⁴ that f is a factor of g. That is

(3.26) $g(t_1, \cdots, t_k) = f(t_1, \cdots, t_k) \sum_{\alpha_1=0}^{a_1+b_1-2} \cdots \sum_{\alpha_k=0}^{a_k+b_k-2} C_{\alpha_1 \cdots \alpha_k} \, t_1^{\alpha_1} \cdots t_k^{\alpha_k}$

where the $C_{\alpha_1 \cdots \alpha_k}$ are unknown. Equating coefficients on both sides of (3.26) we get

(3.27) $\sum_{u_1=0}^{v_1} \cdots \sum_{u_k=0}^{v_k} \left( P_{v_1-u_1-r_1, \cdots, v_k-u_k-r_k} - \prod_{j=1}^{k} \delta_{v_j-u_j, \, r_j} \right) C_{u_1 \cdots u_k} = \xi_{v_1-b_1-r_1+1, \cdots, v_k-b_k-r_k+1} - \prod_{j=1}^{k} \delta_{v_j, \, b_j+r_j-1}$

where $\delta_{ij}$ is the Kronecker delta. But by Theorem 3.1, $\prod_{j=1}^{k} (a_j + b_j - 1)$ of the $\xi$ in $\varphi(t_1, \cdots, t_k)$ are zero since they correspond to values of W which are non-decision points. Hence $\prod_{j=1}^{k} (a_j + b_j - 1)$ of the right-hand sides in (3.27) are zero, with the exception of the one with $v_j = b_j + r_j - 1$ (corresponding to the non-decision point $(0, \cdots, 0)$), which is −1. Hence, we have the required number of equations to solve for the unknown C's and consequently for the $\xi$'s, provided the determinant of the coefficients is different from zero.

4 See, for example, Bôcher [7], Theorem 7, Chapter 16.

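As an illustration, let $R = R_1$; then the C's are obtained by solving the set of linear equations

(3.28) $\sum_{u_1=0}^{v_1} \cdots \sum_{u_k=0}^{v_k} \left( \prod_{j=1}^{k} \delta_{v_j-u_j, \, r_j} - P_{v_1-u_1-r_1, \cdots, v_k-u_k-r_k} \right) C_{u_1 \cdots u_k} = \prod_{j=1}^{k} \delta_{v_j, \, b_j+r_j-1}$

where $v_j$ takes on all integral values from $r_j$ to $a_j + b_j + r_j - 2$ inclusive.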
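For k = 1 the procedure of this section can be carried out exactly in a few lines. The following sketch is an illustration under stated assumptions, not the authors' computation: it takes a one-dimensional walk with step probabilities `P[u + r] = Pr(X = u)`, solves the interior coefficient equations for the C's, and reads the probabilities of the decision points from the remaining coefficients of f·C. The function names and the small Gauss-Jordan solver are hypothetical helpers.

```python
from fractions import Fraction


def solve(A, y):
    """Gauss-Jordan elimination over Fractions; returns x with A x = y."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [x - factor * z for x, z in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]


def decision_distribution(P, r, m, a, b):
    """Return {w: Pr(W = w)} for a walk stopped on leaving -b < y < a.

    P[u + r] is Pr(X = u) for u = -r, ..., m (one-dimensional case only).
    """
    # f(t) = t^r (psi(t) - 1); list index = power of t.
    f = [Fraction(p) for p in P]
    f[r] -= 1
    size = a + b - 1                      # number of unknown C's

    def coeff(v, C):
        # Coefficient of t^v in f(t) * sum_u C_u t^u.
        return sum(f[v - u] * C[u] for u in range(size) if 0 <= v - u < len(f))

    # Powers v = r, ..., a+b+r-2 correspond to non-decision points, where the
    # coefficient of g is 0 except for -1 at v = b+r-1 (the origin).
    rows = range(r, r + size)
    A = [[f[v - u] if 0 <= v - u < len(f) else Fraction(0) for u in range(size)]
         for v in rows]
    rhs = [Fraction(-1) if v == b + r - 1 else Fraction(0) for v in rows]
    C = solve(A, rhs)
    # The remaining powers give the probabilities of the decision points.
    outside = list(range(0, r)) + list(range(r + size, r + size + m))
    return {v - b - r + 1: coeff(v, C) for v in outside}
```

With the symmetric simple walk (r = m = 1, steps ±1 with probability 1/2) and a = 3, b = 2, the sketch returns probability 3/5 for the point −2 and 2/5 for the point 3, in agreement with the classical ruin probabilities.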

3.3. The distribution of n. For any random variable U, let $E_{v_1 \cdots v_k} U$ stand for the expected value of U under the restriction that $W = (v_1, v_2, \cdots, v_k)$. Let $\varphi_1(t_1, \cdots, t_k; \tau)$ be the joint generating function of $W_1, W_2, \cdots, W_k$, and n. Then

(3.31) $\varphi_1(t_1, \cdots, t_k; \tau) = \sum_{v_1} \cdots \sum_{v_k} \xi_{v_1 \cdots v_k} \, t_1^{v_1} \cdots t_k^{v_k} \, E_{v_1 \cdots v_k} \tau^n.$

Let

(3.32) $\psi_1(t_1, \cdots, t_k; \tau) = \tau \psi(t_1, t_2, \cdots, t_k) - 1$

where $\psi(t_1, \cdots, t_k)$ is the joint generating function of $X_{1i}, \cdots, X_{ki}$ and is given by (3.21), and let

(3.33) $\varphi_2(t_1, \cdots, t_k; \tau) = \varphi_1(t_1, \cdots, t_k; \tau) - 1.$

Then, if we fix $\tau$ so that $|\tau| \leq 1$, we see by (3.23) that for all values of $t_1, \cdots, t_k$ for which $\psi_1$ vanishes, $\varphi_2$ also vanishes. Let

(3.34) $f_1(t_1, \cdots, t_k; \tau) = t_1^{r_1} \cdots t_k^{r_k} \, \psi_1(t_1, \cdots, t_k; \tau)$

and

(3.35) $f_2(t_1, \cdots, t_k; \tau) = t_1^{b_1+r_1-1} \cdots t_k^{b_k+r_k-1} \, \varphi_2(t_1, \cdots, t_k; \tau).$

Then for $\tau$ fixed, $f_1$ is a polynomial of degree $r_j + m_j$ in $t_j$ and $f_2$ is a polynomial of degree $a_j + b_j + r_j + m_j - 2$ in $t_j$. Since $f_2$ vanishes for all values of $t_1, \cdots, t_k$ for which $f_1$ vanishes, then if $f_1$ is irreducible, $f_1$ will be a factor of $f_2$. That is, $f_2$ can then be written as

(3.36) $f_2(t_1, \cdots, t_k; \tau) = f_1(t_1, \cdots, t_k; \tau) \sum_{\alpha_1=0}^{a_1+b_1-2} \cdots \sum_{\alpha_k=0}^{a_k+b_k-2} d_{\alpha_1 \cdots \alpha_k} \, t_1^{\alpha_1} \cdots t_k^{\alpha_k}.$

The rest of the argument is identical with that employed in section 3.2. The unknowns in the present case, however, are $\xi_{v_1 \cdots v_k} E_{v_1 \cdots v_k} \tau^n$. When $\xi_{v_1 \cdots v_k} E_{v_1 \cdots v_k} \tau^n$ is expanded in a power series in $\tau$, the coefficient of $\tau^m$ is the probability that $W = (v_1, \cdots, v_k)$ in exactly m steps. We shall, therefore, examine the validity of the expansion of the above function in the neighborhood of $\tau = 0$.

Let us first consider the rectangular region $R = R_1$. In this case the d's are obtained from the equations

(3.37) $\sum_{u_1=0}^{v_1} \cdots \sum_{u_k=0}^{v_k} \left( \prod_{j=1}^{k} \delta_{v_j-u_j, \, r_j} - \tau P_{v_1-u_1-r_1, \cdots, v_k-u_k-r_k} \right) d_{u_1 \cdots u_k} = \prod_{j=1}^{k} \delta_{v_j, \, b_j+r_j-1},$

$(v_j = r_j, r_j + 1, \cdots, a_j + b_j + r_j - 2),$

so that $\xi_{v_1 \cdots v_k} E_{v_1 \cdots v_k} \tau^n$ will be given as a ratio of two polynomials in $\tau$, the denominator of which will be the determinant of the coefficients of (3.37). But this determinant equals unity when $\tau = 0$. Hence the validity of the expansion is established for a rectangular region.

If R is not a rectangle, the value of the determinant of the equations in d will still be unity. This follows from the fact that the number of non-decision points in $R_2$ is precisely the same as the number of non-decision points contained in $R_1$; hence by rearranging the equations they can be made to assume the form (3.37).
APPROXIMATION OF THE DISTRIBUTION OF THE PRODUCT OF
BETA VARIABLES BY A SINGLE BETA VARIABLE 

By John W. Tukey and S. S. Wilks 


Princeton University 

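1. Introduction. In an article published elsewhere in the present issue of the Annals of Mathematical Statistics [1] the g-th moments of two statistical test criteria $L_{mvc}$ and $L_{vc}$ were found to have the following expressions, respectively:

(1) $(k-1)^{g(k-1)} \prod_{i=1}^{k-1} \left[ \frac{\Gamma(\frac{1}{2}(n-1-i) + g)}{\Gamma(\frac{1}{2}(n-1-i))} \right] \frac{\Gamma(\frac{1}{2} n (k-1))}{\Gamma(\frac{1}{2} n (k-1) + g(k-1))}$

(2) $(k-1)^{g(k-1)} \prod_{i=1}^{k-1} \left[ \frac{\Gamma(\frac{1}{2}(n-1-i) + g)}{\Gamma(\frac{1}{2}(n-1-i))} \right] \frac{\Gamma(\frac{1}{2}(n-1)(k-1))}{\Gamma(\frac{1}{2}(n-1)(k-1) + g(k-1))}.$

If we denote by $(a)_g$ the expression $a(a+1)(a+2) \cdots (a+g-1)$ and make use of the fact that

(3) $\Gamma(a + g) = \Gamma(a) \cdot (a)_g$

(4) $\Gamma(a + rg) = \Gamma(a) \cdot (a)_{rg} = \Gamma(a) \cdot r^{rg} \prod_{i=1}^{r} \left( \frac{a+i-1}{r} \right)_g$

where r is a positive integer, the two moments (1) and (2) reduce to

(5) $\prod_{i=1}^{k-1} \frac{\left( \frac{n-1-i}{2} \right)_g}{\left( \frac{n}{2} + \frac{i-1}{k-1} \right)_g}, \qquad \prod_{i=1}^{k-1} \frac{\left( \frac{n-1-i}{2} \right)_g}{\left( \frac{n-1}{2} + \frac{i-1}{k-1} \right)_g}$

respectively.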

For any given value of i (i = 1, 2, ⋯, k − 1) the ratios

$\frac{\left( \frac{n-1-i}{2} \right)_g}{\left( \frac{n}{2} + \frac{i-1}{k-1} \right)_g}, \qquad \frac{\left( \frac{n-1-i}{2} \right)_g}{\left( \frac{n-1}{2} + \frac{i-1}{k-1} \right)_g}$

may each be expressed in the form

$\frac{\Gamma(p_i + q_i) \, \Gamma(p_i + g)}{\Gamma(p_i) \, \Gamma(p_i + q_i + g)}$

which is the g-th moment of a beta variable $u_i$ distributed according to

$\frac{\Gamma(p_i + q_i)}{\Gamma(p_i) \Gamma(q_i)} \, u_i^{p_i - 1} (1 - u_i)^{q_i - 1} \, du_i.$

Each of the moments in (5) is therefore of the form

$\prod_{i=1}^{k-1} \frac{\Gamma(p_i + q_i) \, \Gamma(p_i + g)}{\Gamma(p_i) \, \Gamma(p_i + q_i + g)}.$

Thus, $L_{mvc}$ and $L_{vc}$ are each distributed like the product of k − 1 independent beta variables.

Each of the moments in (5) can be expressed in the general form

(6) $M_g = \prod_{i=1}^{r'} \frac{\left( \frac{1}{x} - A_i + 1 \right)_g}{\left( \frac{1}{x} - B_i + 1 \right)_g}$

where $x = \frac{2}{n}$ (or $\frac{2}{n-1}$), and $A_i$ and $B_i$ are real numbers.

Other likelihood ratio statistical test criteria which have been discussed in the literature have moments which can be expressed in the general form (6). For example, the likelihood ratio criterion $L_1$ for testing the homogeneity of sample variances [2, Neyman and Pearson 1931] has moments of this type. The generalized $L_1$ criterion for samples from a normal multivariate population [3, Wilks 1933] has such moments. The criterion for testing sphericity [4, Mauchly 1940] of a normal multivariate distribution has moments of this kind. All test criteria having this type of moment lie on the interval (0, 1). The exact distribution functions of the criteria, except possibly for $r' = 1$ or 2 in some cases, are very complicated.

The purpose of this note is to consider a method of finding a fractional power of the test criterion which is approximately distributed (in a sense to be described later) according to an incomplete beta (Pearson Type I) distribution function,

(7) $\frac{\Gamma(p + q)}{\Gamma(p) \Gamma(q)} \, u^{p-1} (1 - u)^{q-1} \, du,$

and to find the appropriate values of p, q, and the exponent of the criterion.

2. Generalized hypergeometric series as moment generating functions.

Suppose L is a statistical test criterion, or more generally a random variable having as its g-th moment the expression (6). The moment generating function $\varphi(t)$ of L can be expressed as

(8) $\varphi(t) = \sum_{g=0}^{\infty} M_g t^g = \sum_{g=0}^{\infty} \frac{\prod_{i=1}^{r'} \left( \frac{1}{x} - A_i + 1 \right)_g}{\prod_{i=1}^{r'} \left( \frac{1}{x} - B_i + 1 \right)_g} \, t^g.$

This can be written as

(9) $\varphi(t) = {}_{r'+1}F_{r'} \left[ \begin{matrix} 1, \; \frac{1}{x} - A_1 + 1, \; \cdots, \; \frac{1}{x} - A_{r'} + 1; \; t \\ \frac{1}{x} - B_1 + 1, \; \cdots, \; \frac{1}{x} - B_{r'} + 1 \end{matrix} \right]$

where ${}_{r'+1}F_{r'}[\;]$ is a generalized hypergeometric series [5, Bailey 1935]. We shall not make explicit use of this fact; instead, we shall work with the coefficient of $t^g$ in the series, i.e., $M_g$.

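Let us consider

(10) $\ln M_g = \sum_{i=1}^{r'} \ln \left( \frac{1}{x} - A_i + 1 \right)_g - \sum_{i=1}^{r'} \ln \left( \frac{1}{x} - B_i + 1 \right)_g.$

To expand this in a power series in x consider a single term

(11) $\ln \left( \frac{1}{x} - A + 1 \right)_g = \sum_{j=1}^{g} \ln \left( \frac{1}{x} - A + j \right) = -g \ln x + g \ln (1 - Ax) + \sum_{j=1}^{g} \ln \left( 1 + \frac{jx}{1 - Ax} \right).$

Now

$1 + \frac{jx}{1 - Ax} = 1 + jx + jAx^2 + jA^2 x^3 + \cdots.$

Writing

$S_n(g) = \sum_{j=1}^{g} j^n,$

and using the usual expansion for ln (1 + x), we find

$\ln \left( \frac{1}{x} - A + 1 \right)_g = -g \ln x + [S_1(g) - Ag] x + [-\tfrac{1}{2} A^2 g + A S_1(g) - \tfrac{1}{2} S_2(g)] x^2 + [-\tfrac{1}{3} A^3 g + A^2 S_1(g) - A S_2(g) + \tfrac{1}{3} S_3(g)] x^3 + \cdots.$

Applying this expansion to the separate terms in (10) and writing

(12) $C_n = \sum_{i=1}^{r'} A_i^n - \sum_{i=1}^{r'} B_i^n$

the terms not involving $A_i$ or $B_i$ cancel out, leaving

(13) $\ln M_g = (-C_1 g) x + [-\tfrac{1}{2} C_2 g + C_1 S_1(g)] x^2 + [-\tfrac{1}{3} C_3 g + C_2 S_1(g) - C_1 S_2(g)] x^3 + \cdots.$

We shall return to this expression later.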

3. Powers of a beta variable. If u has (7) as its distribution function, then

(14) $E(u^h) = \frac{(p)_h}{(p + q)_h}.$

If $v = u^r$, r integral, then its g-th moment is given by setting h = rg in (14). We have

$E(v^g) = \frac{(p)_{rg}}{(p + q)_{rg}}.$

But, by (4),

$(a)_{rg} = r^{rg} \prod_{i=1}^{r} \left( \frac{a + i - 1}{r} \right)_g$

so that

(15) $E(v^g) = \prod_{i=1}^{r} \frac{\left( \frac{p + i - 1}{r} \right)_g}{\left( \frac{p + q + i - 1}{r} \right)_g}$

which is a special case of (6) when p is of order n.

Putting $\frac{1}{x} = \frac{p + q}{r}$, $A_i = 1 + (q - i + 1)/r$, and $B_i = 1 - (i - 1)/r$ we have

$C_1 = q,$

$C_2 = q + \frac{q(q + 1)}{r}.$
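The two values just stated follow from definition (12) by direct summation, and they can be confirmed in exact arithmetic. The short check below is only a sketch; the function name `c_sums` is hypothetical, and q is taken integral or fractional with r a positive integer.

```python
from fractions import Fraction


def c_sums(q, r):
    """Return (C_1, C_2) of (12) for A_i = 1 + (q-i+1)/r, B_i = 1 - (i-1)/r."""
    q = Fraction(q)
    A = [1 + (q - i + 1) / r for i in range(1, r + 1)]
    B = [1 - Fraction(i - 1, r) for i in range(1, r + 1)]
    c1 = sum(A) - sum(B)
    c2 = sum(x * x for x in A) - sum(y * y for y in B)
    return c1, c2
```

For q = 4 and r = 3 this gives $C_1 = 4$ and $C_2 = 32/3 = 4 + 4 \cdot 5/3$.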


For any given moment of the form (6), from which x, $C_1$, and $C_2$ can be computed, we determine p, q, and r so as to satisfy

(16) $\frac{p + q}{r} = \frac{1}{x}, \qquad q = C_1$

and to satisfy, as nearly as possible (with r integral),

(17) $q + \frac{q(q + 1)}{r} = C_2,$

i.e.,

$r = \frac{q(q + 1)}{C_2 - q}.$

The use of fractional r is obviously suggested, but its value and validity are not discussed here. Using the values of p, q and r thus obtained, the distribution of the criterion L (having moments (6)) is given approximately by

(18) $\frac{\Gamma(p + q)}{\Gamma(p) \Gamma(q)} \left( \sqrt[r]{L} \right)^{p-1} \left( 1 - \sqrt[r]{L} \right)^{q-1} d\left( \sqrt[r]{L} \right)$

where the approximation is such that all moments are correct through terms of order $\left( \frac{1}{p + q} \right)$ (when moments are expanded in series of $\frac{1}{p + q}$) and nearly (exactly if there is an integral value of r satisfying (17)) correct through terms of order $\left( \frac{1}{p + q} \right)^2$.

4. Examples. Returning to the g-th moment of $L_{mvc}$ given by the first expression in (5) we have

$x = \frac{2}{n}, \qquad r' = k - 1,$

$A_i = \frac{k + 3 - i}{2}, \qquad B_i = \frac{k - i}{k - 1},$

$C_1 = \sum_{i=1}^{k-1} A_i - \sum_{i=1}^{k-1} B_i = \tfrac{1}{4}(k^2 + 3k - 6),$

$C_2 = \sum_{i=1}^{k-1} A_i^2 - \sum_{i=1}^{k-1} B_i^2 = \tfrac{1}{24}[(k + 2)(k + 3)(2k + 5) - 84] - \frac{k(2k - 1)}{6(k - 1)}.$

To determine p, q and r for the fitted distribution of $L_{mvc}$ we set

$\frac{p + q}{r} = \frac{n}{2}, \qquad q = \tfrac{1}{4}(k^2 + 3k - 6), \qquad r = \frac{q(q + 1)}{C_2 - q}$

and solve for p, q and r. We have the following table of values, p, q and r for various values of k (p being calculated by using the rounded values of r):

Thus, by rounding r off to the nearest integer and using this rounded value of r in determining p, we have values of p, q and r for each value of k, which, when substituted in (18), give us the desired fitted beta distribution for $L_{mvc}$. For k = 3, the fitted distribution is the exact distribution.

For the g-th moment of $L_{vc}$ which is given by the second expression in (5), it is convenient to expand in powers of $\frac{2}{n - 1}$. Hence we have

$x = \frac{2}{n - 1}, \qquad r' = k - 1,$

$A_i = \frac{k + 2 - i}{2}, \qquad B_i = \frac{k - i}{k - 1},$

$C_1 = \tfrac{1}{4}(k^2 + k - 4),$

$C_2 = \tfrac{1}{24}[(k + 1)(k + 2)(2k + 3) - 30] - \frac{k(2k - 1)}{6(k - 1)}.$

To determine p, q and r for fitting the distribution function of $L_{vc}$ we put

$\frac{p + q}{r} = \frac{n - 1}{2}, \qquad q = \tfrac{1}{4}(k^2 + k - 4), \qquad r = \frac{q(q + 1)}{C_2 - q}.$

We have the following table of values of p, q and r for several values of k:

  k      r       r (rounded)    p                q
  3      2       2              n - 3            2
  4      2.88    3              1.5n - 5.5       4
  5      3.71    4              2n - 8.5         6.5
  6      4.52    5              2.5n - 12        9.5
  7      5.32    5              2.5n - 15.5      13
  8      6.14    6              3n - 20          17
  9      6.88    7              3.5n - 25        21.5
 10      7.82    8              4n - 30.5        26.5
 20     15.26    15             7.5n - 111.5     104

By rounding r off to the nearest integer, and using the rounded value of r in determining p, we have values of p, q and r for each value of k which, when substituted in (18), give us the desired fitted beta distribution for $L_{vc}$. For k = 3, the fitted distribution is the exact distribution.

For a given value of k, approximate 5% and 1% points of $\sqrt[r]{L_{mvc}}$ and $\sqrt[r]{L_{vc}}$ can therefore be obtained from Thompson's [6] tables of the Incomplete Beta Function by entering the tables with $\nu_1 = 2q$ and $\nu_2 = 2p$. For example, for k = 6 the 5% and 1% points of $\sqrt[5]{L_{mvc}}$ are obtained by entering Thompson's tables with $\nu_1 = 24$ and $\nu_2 = 5n - 24$.
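The fitting rule of Section 3 is simple enough to tabulate mechanically. The sketch below is an illustration only: the function name `fit` and its return convention are hypothetical, and it assumes the expressions for $C_1$ and $C_2$ as they are read from Section 4, with $1/x = n/2$ for $L_{mvc}$ and $1/x = (n-1)/2$ for $L_{vc}$. It returns q, the unrounded r, the rounded r, and the coefficients of p written as slope·n + constant.

```python
from fractions import Fraction


def fit(k, criterion):
    """Fitted beta parameters for criterion 'mvc' or 'vc' with k >= 3.

    Returns (q, r_exact, r_rounded, p_slope, p_const), p = p_slope*n + p_const.
    """
    if criterion == "mvc":
        c1 = Fraction(k * k + 3 * k - 6, 4)
        c2 = Fraction((k + 2) * (k + 3) * (2 * k + 5) - 84, 24)
        shift = 0          # 1/x = n/2
    elif criterion == "vc":
        c1 = Fraction(k * k + k - 4, 4)
        c2 = Fraction((k + 1) * (k + 2) * (2 * k + 3) - 30, 24)
        shift = 1          # 1/x = (n - 1)/2
    else:
        raise ValueError("criterion must be 'mvc' or 'vc'")
    c2 -= Fraction(k * (2 * k - 1), 6 * (k - 1))
    q = c1                                   # second equation of (16)
    r_exact = q * (q + 1) / (c2 - q)         # solution of (17)
    r_rounded = round(r_exact)
    # First equation of (16): p = r (n - shift)/2 - q.
    p_slope = Fraction(r_rounded, 2)
    p_const = -Fraction(r_rounded * shift, 2) - q
    return q, r_exact, r_rounded, p_slope, p_const
```

For instance, `fit(6, "vc")` gives q = 9.5, r ≈ 4.52 rounded to 5, and p = 2.5n − 12, while `fit(6, "mvc")` gives q = 12 and p = 2.5n − 12, so that 2q = 24 and 2p = 5n − 24.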
SOME FUNDAMENTAL CURVES FOR THE
SOLUTION OF SAMPLING PROBLEMS 

By Edward C. Molina 
East Orange, N. J. 

1. Summary. In using collateral information in an inverse probability situation to estimate a population fraction from a sample fraction it is necessary to use some particular form for the a priori probability function. This paper points out the advantages of using $Kx^r(1 - x)^s$ for this purpose. The application then involves only the Incomplete Beta Function.

Graphs of the 10, 25, 50, 75 and 90 per cent points of the Incomplete Beta 
Function are given. They cover a range which includes and extends previous 
tabulations. 

2. Introduction. The engineer, scientist or industrialist is often confronted with the following “sampling” problem:

“The probability, p, of an event happening in a single trial is constant from trial to trial, but the numerical value of this constant is unknown. A series of n trials is made and the event happens c times, c < n. What light does this statistical data shed on the unknown value of p?”

As a concrete example, suppose that a new type of brakes is proposed for a given class of steam locomotives making the run from Buffalo to Detroit.¹ Let each of 30 locomotives be equipped with a set of the new brakes and given a trial run. Of these, 26 make satisfactory runs, so far as the behavior of the brakes is concerned; the remaining four encounter difficulties. Here, the event of interest is a satisfactory run, n = 30 and c = 26. What “weight” (confidence²) may the design engineer assign to the assumption that, say, 25/30 < p < 27/30?

Practical decisions involving such statistical data are usually based on a combination of the data with “collateral” information. In fact, the applied statistician is all too familiar with the extreme case where the statistical data are so meagre as to provide no information and where a decision must be made now; in these cases the decision is made solely on the basis of the collateral information, and rightly so.

The methods of statistical analysis and presentation developed up to the pres- 
ent have concentrated on the other extreme case, where the statistical data are 
so good that collateral information can be neglected. 

1 This fictitious example convicts the writer of total ignorance of railroad engineering. Nevertheless, the illustration brings out, in concrete terms, the class of sampling problems under consideration.

2 The purely intuitive meaning to be attached to “weight” and “confidence” is the same. However, the curves presented with this paper are not based on the theory which underlies what are known, in statistical literature, as “confidence intervals”.

There is a real need for methods of analysis and presentation to be used where both the statistical data and the collateral information should be used. However, when the significance of the collateral information is adequately expressed by a function w(x), x being a permissible value of the unknown p, the classic Bayes-Laplace theory (see [1]) of inverse probability gives the solution to a sampling problem.

The purpose of this paper is to present a set of sampling curves based on a w(x) function whose form embodies some important properties.³

3. Hardy’s collateral frequency function. Consider again the locomotive brakes problem. The new design may have been carefully engineered, in accordance with well-known principles, to reduce costs at the expense of a slight reduction in reliability of operation. In such a situation, the collateral information would be somewhat as follows: There is a high “probability” that the unknown value of p is a little below the known value for the old type of brakes. Moreover, it may be assumed that the “probability” drops rapidly for values of p departing materially from this old value. Suppose the latter is p = .95; then the collateral information would be presented by some such curve as number 5 in Figure 1, the mode (peak) of this curve being at .90, which is slightly below the old .95 value.

Number 5, of Figure 1, belongs to the family of curves corresponding to the frequency function

$w(x) = Kx^r(1 - x)^s.$

This form for w(x) was suggested, in 1889, by the British actuary Sir George F. Hardy (see [2]) for the construction of mortality tables. Its mode, mean and variance are given by the equations

Mode = $r/(r + s)$

Mean = $(r + 1)/(r + s + 2)$

Variance = $(r + 1)(s + 1)/[(r + s + 2)^2 (r + s + 3)]$
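The three summary formulas are easy to evaluate exactly for any trial values of r and s. The helper below is a minimal sketch (the name `hardy_summary` is hypothetical); it assumes r + s > 0, since the uniform case r = s = 0 has no single mode.

```python
from fractions import Fraction


def hardy_summary(r, s):
    """Mode, mean and variance of w(x) = K x^r (1 - x)^s, assuming r + s > 0."""
    r, s = Fraction(r), Fraction(s)
    mode = r / (r + s)
    mean = (r + 1) / (r + s + 2)
    variance = (r + 1) * (s + 1) / ((r + s + 2) ** 2 * (r + s + 3))
    return mode, mean, variance
```

For the locomotive illustration, `hardy_summary(9, 1)` returns a mode of 9/10, the .90 peak described above, with mean 5/6 and variance 5/468.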

G. J. Lidstone (see [3]) has pointed out that the Hardy form for w(x) has two important advantages. First: “By suitable choice of r and s any required values of the mode or mean and the variance of z* can be reproduced, and thus a great variety of distributions may be approximately represented.” Lidstone’s z* is our w(x). Second: “The factors $x^r$ and $(1 - x)^s$ unite in the simplest and most elegant way with similar factors in the Laplacian integrand . . . ”.


3 Many statisticians, including a referee of this paper, feel that it is a common situation to have the collateral information so vague and elusive that it is virtually impossible to take it into account via inverse probability. (The author doubts this.) Such statisticians may wish to use the Clopper-Pearson confidence intervals, using no collateral information, in which case these curves can be used as indicated by Scheffé (“Note on the use of the tables of percentage points of the incomplete beta function to calculate small sample confidence intervals for a binomial p”, Biometrika, August, 1944).

From this second advantage there follows a third which will be presented in section 6 below.

4. Theory. The Bayes-Laplacian formula gives us

(1) $P(p < X) = \int_0^X w(x) \, x^c (1 - x)^{n-c} \, dx \Big/ \int_0^1 w(x) \, x^c (1 - x)^{n-c} \, dx$

for the “a posteriori probability” that p < X. In this formula, the product $x^c (1 - x)^{n-c}$ takes care of the fact that the event happened c times in the n trials; the factor w(x) represents, quantitatively, the collateral information.

Fig. 1. Particular forms of the a priori (collateral information) function:

  Curve    r              s              Form
  1        0              0              K
  2        [illegible]    [illegible]    [illegible]
  3        1              1              Kx(1 - x)
  4        2              1              Kx^2(1 - x)
  5        9              1              Kx^9(1 - x)

Adopting, now, Hardy’s frequency function, we assume that

(2) $w(x) = Kx^r(1 - x)^s,$

r and s being assigned values in accordance with the collateral information pertaining to the particular problem under consideration. Theoretically, the constant K should be such that

$\int_0^1 w(x) \, dx = 1,$

but, since w(x) enters in both numerator and denominator of (1), any desirable value may be given to K. Advantage has been taken of this in constructing Figure 1; to facilitate comparison of the five curves shown therein, for each curve K is such that the maximum ordinate is equal to 1.

The second advantage, pointed out by Lidstone, of the form adopted in this paper for the function w(x) becomes apparent immediately on substitution of (2) in (1). We obtain

(3) $P(p < X) = \int_0^X x^C (1 - x)^{N-C} \, dx \Big/ \int_0^1 x^C (1 - x)^{N-C} \, dx$

with C = c + r and N = n + r + s. Therefore, a single family of fundamental curves, plotted with reference to C and N, will give the solutions for a multitude of different practical problems. To solve a particular problem, for which the values of n, c, r and s are specified, we merely enter the curves with C = c + r and N = n + r + s. These linear relations transform all a posteriori curves, published on the assumption that w(x) is a constant, into fundamental curves; that is, they are applicable with the more general form (2). For example: The information given on the sheets of inverse curves (inserted in the back cover pocket) of Col. Leslie E. Simon’s Engineer’s Manual of Statistical Methods includes the restriction “that prior to sampling, one lot fraction defective is as likely as another”. It is now obvious that the use of Col. Simon’s curves is not so limited; his curves may be used in any situation wherein the available collateral information is covered by the assumption that w(x) has the Hardy form. Likewise, the “Weight = .98” and “Weight = .8” curves (“confidence”, in the intuitive sense), presented by R. P. Crowell and the writer in their paper, now have a much wider range of applicability.
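When C and N are whole numbers, the ratio of integrals in (3) can be evaluated without tables by the standard identity between the incomplete beta function and a binomial tail sum. The sketch below uses that identity, which is a swapped-in technique and not the method used to draw the curves; the function name `posterior_cdf` is hypothetical and the restriction to integral r and s is an assumption of the sketch.

```python
from math import comb


def posterior_cdf(X, c, n, r=0, s=0):
    """P(p < X) from (3) for integral C = c + r and N = n + r + s.

    Uses the identity I_X(C + 1, N - C + 1) = P(Binomial(N + 1, X) >= C + 1).
    """
    C = c + r
    N = n + r + s
    total = N + 1
    return sum(comb(total, j) * X ** j * (1 - X) ** (total - j)
               for j in range(C + 1, total + 1))
```

With no trials and a constant w(x) the function returns X itself, as it should; for the locomotive data one would call `posterior_cdf(X, 26, 30, r=9, s=1)`, which corresponds to entering the curves with C = 35 and N = 40.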


5. Curves. The ratio of definite integrals in equation (3) is tabulated, in a different notation, in “Tables of the Incomplete Beta Function”, edited by Karl Pearson.

  This paper     Pearson Tables      Thompson Tables (see [5])
  C              p - 1               (ν_2 - 2)/2
  N - C          q - 1               (ν_1 - 2)/2
  X              x                   tabulated value
  P(p < X)       tabulated value     caption to Table

The range of values of C and (N − C) covered by the Pearson Tables is indicated by the shaded area in Figure 7. For curve points falling outside this range (except for C = 1 and 2, found from the binomial summation by trial and error) recourse was had to a series developed by the writer for the solution of some problems confronting him, as Switching Theory Engineer, in the Bell Telephone Laboratories. Many points of the C = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12 and 14 curves can be obtained directly from the Thompson Tables. They do not, however, give any points for the C = 16, 18, 20, 25, 30, 40, 45 and 50 curves. It may be added that, except for certain marginal values, the Thompson Tables were also derived from the Pearson Tables.

Fig. 2

Fig. 3

Five sets of fundamental curves are submitted, namely,

  Figure 2,  P(p < X) = .25,  X = p_1
  Figure 3,  P(p < X) = .75,  X = p_2
  Figure 4,  P(p < X) = .10,  X = p_1
  Figure 5,  P(p < X) = .90,  X = p_2
  Figure 6,  P(p < X) = .50,  X = p_0

It will be noted that $p_1$ has been written instead of X for the curves such that P(p < X) is less than .50; likewise, $p_2$ for X for those corresponding to P(p < X) greater than .50; $p_0$ for X for the P(p < X) = .50 curves.

Fig. 4

For each pair of values of C and N, the curves of Figures 2 and 3 give the range

$P(p_1 < p < p_2) = .50$

whereas the curves of Figures 4 and 5 give the range

$P(p_1 < p < p_2) = .80$

As an example of the applicability of the fundamental curves, let us reconsider the locomotive problem for which n = 30 and c = 26. It was suggested that the r = 9, s = 1 curve of Figure 1 might well represent the collateral information available. Therefore we take N = 30 + 9 + 1 = 40 and C = 26 + 9 = 35. Entering Figures 2, 3, 4 and 5 with this data we find

  Fig.   P(p < p_1)   p_1      Fig.   P(p < p_2)   p_2
  2      .25          .83      3      .75          .89
  4      .10          .79      5      .90          .92

Fig. 5

Fig. 6

Thus we have, for the unknown probability of a successful run with a new set of brakes,

.83 < p < .89, with weight .50

and

.79 < p < .92, with weight .80

6. Sequential property of the curves. The original draft of this paper was
submitted to Dr. W. V. Houston⁴ in connection with the solution of a problem
in which he was interested. Regarding equation (3), Dr. Houston made a very
significant comment, the burden of which may be stated as follows: Suppose
that before the series of n trials had been made, it was known that, at some
earlier time, a series of r + s trials had resulted in r successful outcomes. Suppose,
moreover, that the collateral information called for the assumption that,
a priori, all values of p were equally likely. Under these circumstances equation
(3), derived by substitution of (2) in (1), gives $P(p < X)$ for two consecutive
series of trials, one of r + s with r successes followed by another of n with c
successes. An immediate generalization of Dr. Houston's thought shows that
the fundamental curves may be entered with

$$N = n_1 + n_2 + \cdots + n_i + \cdots + n_m + r + s,$$

$$C = c_1 + c_2 + \cdots + c_i + \cdots + c_m + r,$$

for the solution of a problem involving m consecutive series of trials, $n_i$ and $c_i$
being the number of trials and successes, respectively, in the ith series; the
introduction of r and s removing the restriction that all values of p were a priori
equally likely.

[Fig. 7]

⁴ Of the California Institute of Technology and now President of Rice Institute,
Houston, Texas. It was Dr. Houston who gave the impetus to the publication of this paper.

"Tables of the Incomplete Beta-Function," edited by Karl Pearson, can be used for
evaluation of

$$P = \frac{\int_0^X x^C (1 - x)^{N - C}\,dx}{\int_0^1 x^C (1 - x)^{N - C}\,dx}$$

only when values of (N - C) and C are in
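The pooling rule of Section 6 amounts to simple bookkeeping. The sketch below is an illustration only: the function name `pooled_entry` and its argument layout are assumptions, with each series given as a pair (n_i, c_i) and the collateral information as r and s.

```python
def pooled_entry(series, r=0, s=0):
    """Return (N, C) with which to enter the fundamental curves.

    series: list of (n_i, c_i) pairs, trials and successes in the ith series.
    r, s:   collateral information, r successes in r + s earlier trials.
    """
    series = list(series)
    for trials, successes in series:
        if not 0 <= successes <= trials:
            raise ValueError("successes must lie between 0 and trials")
    total_trials = sum(trials for trials, _ in series)
    total_successes = sum(successes for _, successes in series)
    return total_trials + r + s, total_successes + r


if __name__ == "__main__":
    # Locomotive example: one series of 30 trials with 26 successes, r = 9, s = 1.
    print(pooled_entry([(30, 26)], r=9, s=1))
```

Splitting the 30 trials into several consecutive series leaves N and C unchanged, which is the sequential property described above.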
ENLARGEMENT METHODS FOR COMPUTING THE INVERSE MATRIX

By Louis Guttman
Cornell University

1. Summary. The enlargement principle provides techniques for inverting 
any nonsingular matrix by building the inverse upon the inverses of successively 
larger submatrices. The computing routines are relatively easily learned since 
they are repetitive. Three different enlargement routines are outlined: first- 
order, second-order, and geometric. None of the procedures requires much more 
work than is involved in squaring the matrix. 

2. Introduction. A set of methods is presented here for computing the in- 
verse matrix, based on what we shall call an enlargement principle. The princi- 
ple is to build the inverse upon the inverses of successively larger submatrices. 
This leads to simple repetitive routines that are not unlike iterative steps, but 
afford a direct solution. 

The basis for such routines has also been noticed before,¹ but does not seem to
have attracted the attention it merits. A possible reason for this lack of attention
may be the belief that the methods apply only to a restricted class of matrices.
We establish a simple lemma in this paper which shows that the enlargement
methods apply to all nonsingular matrices, so that their use is perfectly
general.

The enlargement principle may be considered an opposite of the "condensation"
principle that governs Gauss' method of elimination and its variants such
as the Doolittle procedure and Aitken's "pivotal condensation."² It is interesting
that the same formula upon which the enlargement methods are based can
also serve as a foundation for the condensation methods, as is shown in section
7 below.

The enlargement methods have the following characteristics : 

(1) The first-order procedure outlined in the next section has been learned 
by statistical clerks in about ten minutes. People who calculate inverses only 
occasionally and forget the process between times should find the method as 
economical as those who must constantly compute inverses. 

(2) They are direct methods, and yield an exact answer with not much more
work than is involved in squaring the matrix.

(3) They can be adapted to electric punch-card systems, which will be effi- 
cient when very large matrices are to be inverted. 

¹ It has appeared earlier in [2]. Waugh's recent note [10] also rediscovers the basic
formula although only a specialized use is suggested there. Professor Harold Hotelling has
called my attention to reference [1], which overlaps substantially with the present paper,
and to a use of an enlargement approach to computing latent roots and vectors [9]. I am
also indebted to Professor Hotelling for other helpful comments on the present paper.

² For an excellent summary and bibliography of direct and iterative methods for
computing the inverse matrix see ([5], [6]).
(4) A sequence of inverses is yielded. Exact inverses of successively larger
submatrices are computed in the routines, and these inverses are often themselves
of interest. For correlation problems, this means that a sequence of sets
of successively higher order multiple correlation constants is produced routinely.

(5) The general formula upon which the methods are based allows many variations
in procedure, so that special adaptations can be easily made for special
matrices.

A “first-order” enlargement procedure for computing the inverse matrix will 
be outlined in the next section. The proof for the method follows from the gen- 
eral formula in section 4. This procedure and formula are also described in 
[2]. Other enlargement routines are described in subsequent sections. Some 
additional formulas of relevance are discussed in section 8. 

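3. First-order enlargement. Let the matrix whose inverse is desired be

$$A_n = \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{Vmatrix}.$$

The following sequence of successively larger principal submatrices will be
assumed to be nonsingular:

$$A_2 = \begin{Vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{Vmatrix}, \qquad A_3 = \begin{Vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{Vmatrix}, \qquad \cdots, \qquad A_n.$$

If necessary, the rows and columns of $A_n$ can always be shifted to obtain such a
sequence. The following additional notation will be used:

$$B_i = (a_{1,i+1} \;\; a_{2,i+1} \;\; \cdots \;\; a_{i,i+1})$$

$$C_i = (a_{i+1,1} \;\; a_{i+1,2} \;\; \cdots \;\; a_{i+1,i})$$

$$d_i = a_{i+1,i+1}.$$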

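Thus, we can write

$$A_{i+1} = \begin{Vmatrix} A_i & B_i' \\ C_i & d_i \end{Vmatrix}, \qquad (i = 2, 3, \cdots, n - 1).$$

The first-order enlargement procedure is to compute in turn $A_2^{-1}, A_3^{-1}, \cdots, A_n^{-1}$.
The inverse of $A_2$ is computed by the traditional steps:

{1} Compute $\Delta = a_{11}a_{22} - a_{21}a_{12}$, and compute $1/\Delta$.
{2} Then

$$A_2^{-1} = \begin{Vmatrix} \Delta^{-1}a_{22} & -\Delta^{-1}a_{12} \\ -\Delta^{-1}a_{21} & \Delta^{-1}a_{11} \end{Vmatrix}.$$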
Remember that $B_2 = (a_{13} \; a_{23})$, $C_2 = (a_{31} \; a_{32})$, and that $d_2 = a_{33}$. The steps
for computing $A_3^{-1}$ are as follows:

{3} Compute $E_2' = A_2^{-1}B_2'$.

{4} Compute $f_2 = d_2 - C_2E_2'$.

{5} Compute $1/f_2$.

{6} Compute $G_2' = f_2^{-1}E_2'$, and compute $H_2 = f_2^{-1}C_2A_2^{-1}$.

{7} To each element in $A_2^{-1}$ add the product of the corresponding elements
in $E_2'$ and $H_2$ to form $K_2 = A_2^{-1} + E_2'H_2$.

Then the third order inverse is

$$A_3^{-1} = \begin{Vmatrix} K_2 & -G_2' \\ -H_2 & 1/f_2 \end{Vmatrix}.$$

In general, to obtain $A_{i+1}^{-1}$ from $A_i^{-1}$, $(i = 2, 3, \cdots, n - 1)$, imitate³ steps
{3} through {7}:

{3'} Compute $E_i' = A_i^{-1}B_i'$.

{4'} Compute $f_i = d_i - C_iE_i'$.

{5'} Compute $1/f_i$.

{6'} Compute $G_i' = f_i^{-1}E_i'$, and compute $H_i = f_i^{-1}C_iA_i^{-1}$.

{7'} Compute $K_i = A_i^{-1} + E_i'H_i$. Then

$$A_{i+1}^{-1} = \begin{Vmatrix} K_i & -G_i' \\ -H_i & 1/f_i \end{Vmatrix}.$$

By repeated applications of steps {3'} through {7'} to the successively larger
$A_i^{-1}$, $A_n^{-1}$ is attained.
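The first-order routine can be sketched in a few lines. The following is an illustrative version in Python, not the author's desk-calculator layout: it uses exact fractions, starts from $A_1 = a_{11}$ rather than from the 2 x 2 case, and assumes the leading principal submatrices are nonsingular (no reordering of rows and columns is attempted). The names `matmul` and `first_order_enlargement` are hypothetical.

```python
from fractions import Fraction


def matmul(x, y):
    """Row-by-column product of two list-of-lists matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def first_order_enlargement(a):
    """Invert a square matrix by bordering A_i into A_{i+1}, one row and column at a time."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    if a[0][0] == 0:
        raise ZeroDivisionError("a_11 is zero; reorder rows and columns first")
    inv = [[1 / a[0][0]]]                          # A_1^{-1}
    for i in range(1, n):
        b_col = [[a[r][i]] for r in range(i)]      # B_i' as a column
        c_row = [a[i][:i]]                         # C_i as a row
        d = a[i][i]                                # d_i
        e_col = matmul(inv, b_col)                 # {3'} E_i' = A_i^{-1} B_i'
        f = d - matmul(c_row, e_col)[0][0]         # {4'} f_i = d_i - C_i E_i'
        if f == 0:
            raise ZeroDivisionError("a leading submatrix is singular; reorder first")
        f_inv = 1 / f                              # {5'}
        g_col = [f_inv * e[0] for e in e_col]      # {6'} G_i'
        h_row = [f_inv * v for v in matmul(c_row, inv)[0]]   # {6'} H_i
        k = [[inv[r][c] + e_col[r][0] * h_row[c] for c in range(i)]
             for r in range(i)]                    # {7'} K_i = A_i^{-1} + E_i' H_i
        inv = [k[r] + [-g_col[r]] for r in range(i)] + [[-h for h in h_row] + [f_inv]]
    return inv


if __name__ == "__main__":
    a = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
    for row in first_order_enlargement(a):
        print([str(v) for v in row])
```

Exact fractions are used so that the product of the matrix and its computed inverse can be compared with the unit matrix without rounding error; a floating-point version would follow the same steps.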

If $A_n$ is symmetric, then almost half the work is saved, for then $B_i = C_i$,
$G_i = H_i$, and $K_i$ is symmetric, $(i = 2, 3, \cdots, n - 1)$.

To help gauge the amount of work needed to arrive at $A_n^{-1}$, let us compare it
with the work that would be needed to square $A_n$. For the general asymmetric
case, $n^2$ product sums of $n$ terms each are required for $A_n^2$, a total of $n^3$
multiplications. With calculating machines, the sums of the products are accumulated,
so that no separate process of addition is involved. To reach $A_n^{-1}$ by the above
enlargement method, $n^3 - n$ multiplications are required. Most of the addition
is accomplished in the process by accumulative multiplication, but an additional
$\frac{n(n - 1)(2n - 1)}{6} + n - 3$ terms have to be added otherwise. Furthermore,
$n - 1$ reciprocal numbers are needed. Thus, $A_n^{-1}$ involves somewhat fewer
multiplications than does $A_n^2$, but needs more additions, as well as some reciprocal
numbers.


³ Actually, these steps could be used immediately in place of steps {1} and {2} to
compute $A_2^{-1}$, by letting $i = 1$, and letting $A_1 = a_{11}$ (which may be assumed different from zero).
The traditional method, however, is quicker for the 2 x 2 matrix.
In linear multiple correlation problems, if $A_{i+1}$ is the correlation matrix of the
first $i + 1$ variates, then $E_i$ consists of the regression coefficients of the first $i$
variates for predicting the $(i + 1)$th variate, and, since $d_i = 1$, $1 - f_i$ is the
square of the multiple correlation coefficient for this regression.


4. A lemma and the general formula. The enlargement procedure just outlined
is one of many possible routines which can be developed from a general
formula for the inverse matrix in partitioned form. This formula seems to have
appeared first in [2], where it is stated that the method applies only to the cases
where $f_i \neq 0$ in step {4}. We shall establish here a lemma that shows that this
is no restriction, for the submatrix in step {4} is always nonsingular. Our lemma
proves that the enlargement methods will invert any nonsingular matrix.

Let $A_n$ be a nonsingular matrix of order $n$, partitioned in the form

$$A_n = \begin{Vmatrix} A & B' \\ C & D \end{Vmatrix}, \tag{1}$$

where $A$ is of order $m$, $(1 \le m < n)$, and will be assumed nonsingular. $B$ and
$C$ are of $n - m$ rows and $m$ columns, and $D$ is of order $n - m$.

The following lemma is needed to show that enlargement methods will invert 
any nonsingular matrix: 

Lemma. If in (1), both $A_n$ and $A$ are nonsingular, then the matrix

$$F = D - CA^{-1}B' \tag{2}$$

is nonsingular.

For the proof, postmultiply the first submatric column of $A_n$ by $A^{-1}B'$ and
subtract from the second, leaving

$$M = \begin{Vmatrix} A & 0 \\ C & F \end{Vmatrix}.$$

$M$ differs from $A_n$ only by an elementary transformation; hence its rank is that
of $A_n$. But clearly the rank of $M$ is the sum of the ranks of $A$ and $F$. Therefore,
the rank of $F$ is $n - m$, and $F$ is nonsingular.

The inversion formula itself is the following identity:

$$\begin{Vmatrix} A & B' \\ C & D \end{Vmatrix}^{-1} = \begin{Vmatrix} A^{-1} + A^{-1}B'F^{-1}CA^{-1} & -A^{-1}B'F^{-1} \\ -F^{-1}CA^{-1} & F^{-1} \end{Vmatrix}. \tag{3}$$

A direct verification that the identity holds can be obtained by multiplying the
right member in either direction by the right member of (1), yielding the unit
matrix.

In section 3, the formula exhibited for $A_{i+1}^{-1}$ at step {7'} is easily identified
as a special case of formula (3) where $n = i + 1$, $m = i$. $F$ corresponds to $f_i$,
which is a scalar number; hence $F^{-1}$ is easily computed in this case.
5. Second-order enlargement. In formula (3), once $A^{-1}$ is given, the rest of
the work is essentially straightforward matric multiplication, except for computing
$F^{-1}$. In section 3, $F$ was easily inverted since it was of order unity. $F$
can also be easily inverted if it is of order two, so that a second-order enlargement
procedure is feasible, computing $A_{i+2}^{-1}$ from $A_i^{-1}$. The steps are similar to those
in section 3 but involve larger matrices.

Letting $A_i$ have the same meaning as in section 3, define now $B_i$, $C_i$, and $D_i$
according to the partitioning

$$A_{i+2} = \begin{Vmatrix} A_i & B_i' \\ C_i & D_i \end{Vmatrix}.$$

Then $B_i$ and $C_i$ are of two rows and $i$ columns, and $D_i$ is of order two. Compute
$A_2^{-1}$ as in section 3. From then on, to compute $A_{i+2}^{-1}$ from $A_i^{-1}$, the steps are:

{3''} Compute $E_i' = A_i^{-1}B_i'$.

{4''} Compute $F_i = D_i - C_iE_i'$.

{5''} Compute $F_i^{-1}$ by steps {1} and {2} of section 3.

{6''} Compute $G_i' = E_i'F_i^{-1}$, and compute $H_i = F_i^{-1}C_iA_i^{-1}$.

{7''} Compute $K_i = A_i^{-1} + E_i'H_i$.

Then

$$A_{i+2}^{-1} = \begin{Vmatrix} K_i & -G_i' \\ -H_i & F_i^{-1} \end{Vmatrix}.$$

If $n$ is even, successive enlargements will lead to $A_n^{-1}$. If $n$ is odd, then $A_{n-1}^{-1}$ is
attained, from which $A_n^{-1}$ can be computed according to section 3.

The number of multiplications and additions for this procedure is the same as
for section 3. However, less writing is involved since only about half as many
$A_i$ are inverted. A disadvantage is that it is more complicated at each stage
than is the procedure of section 3.


6. Geometric enlargement. Another routine is that which may be called
geometric enlargement. Here, $A_{2i}^{-1}$ is computed from $A_i^{-1}$. The steps may be
described as follows. Letting $A_i$ have the same meaning as previously, redefine
$B_i$, $C_i$, and $D_i$ according to the partitioning

$$A_{2i} = \begin{Vmatrix} A_i & B_i' \\ C_i & D_i \end{Vmatrix}.$$

Then $B_i$, $C_i$, and $D_i$ are all, like $A_i$, square matrices of order $i$. Compute $A_2^{-1}$
according to steps {1} and {2}, and compute $A_4^{-1}$ according to steps {3''} through
{7''}. From then on, to compute $A_{2i}^{-1}$ from $A_i^{-1}$, the steps are formally the same
as before, with a complication in step {5'''}:

{3'''} Compute $E_i' = A_i^{-1}B_i'$.

{4'''} Compute $F_i = D_i - C_iE_i'$.

{5'''} Compute $F_i^{-1}$ by geometric enlargement in the same way as $A_i^{-1}$.

{6'''} Compute $G_i' = E_i'F_i^{-1}$, and compute $H_i = F_i^{-1}C_iA_i^{-1}$.

{7'''} Compute $K_i = A_i^{-1} + E_i'H_i$.

Then,

$$A_{2i}^{-1} = \begin{Vmatrix} K_i & -G_i' \\ -H_i & F_i^{-1} \end{Vmatrix}.$$

This method involves less writing than the others, but is more complicated.
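Formula (3) with blocks of roughly equal size lends itself to a recursive sketch. The version below is an illustration in the spirit of geometric enlargement rather than the routine as laid out above: it splits the matrix at the middle, inverts the leading block and the matrix $F$ by the same procedure, and assembles the four blocks of (3). It assumes that every leading block met in the recursion is nonsingular, which holds for positive definite matrices but otherwise may require reordering; the helper names are hypothetical.

```python
from fractions import Fraction


def matmul(x, y):
    """Row-by-column product of two list-of-lists matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def combine(x, y, sign):
    """Elementwise x + sign * y."""
    return [[u + sign * v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def block_inverse(a):
    """Invert a square matrix by recursive use of the partitioned formula (3)."""
    n = len(a)
    if n == 1:
        if a[0][0] == 0:
            raise ZeroDivisionError("singular leading block; reorder first")
        return [[1 / Fraction(a[0][0])]]
    m = n // 2
    top_left = [row[:m] for row in a[:m]]      # A
    top_right = [row[m:] for row in a[:m]]     # B'
    bottom_left = [row[:m] for row in a[m:]]   # C
    bottom_right = [row[m:] for row in a[m:]]  # D
    a_inv = block_inverse(top_left)
    e = matmul(a_inv, top_right)                           # E' = A^{-1} B'
    f = combine(bottom_right, matmul(bottom_left, e), -1)  # F = D - C E'
    f_inv = block_inverse(f)                               # F inverted the same way
    g = matmul(e, f_inv)                                   # G' = E' F^{-1}
    h = matmul(f_inv, matmul(bottom_left, a_inv))          # H = F^{-1} C A^{-1}
    k = combine(a_inv, matmul(e, h), +1)                   # K = A^{-1} + E' H
    top = [k[r] + [-v for v in g[r]] for r in range(m)]
    bottom = [[-v for v in h[r]] + f_inv[r] for r in range(n - m)]
    return top + bottom


if __name__ == "__main__":
    a = [[4, 1, 2, 0], [1, 3, 0, 1], [2, 0, 5, 1], [0, 1, 1, 2]]
    for row in block_inverse(a):
        print([str(v) for v in row])
```

Splitting at the middle keeps the two recursive inversions of comparable size, which is the point of the geometric variant; the order of the factors in $G' = E'F^{-1}$ matters once $F$ is no longer a scalar.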


7. Condensation methods; special cases. Formula (3) also affords a basis for
condensation methods by "back solution." For example, let $A$ be of order $m$,
where $m$ is one or two so that $A$ is easily inverted. Then $F$ is of order $n - m$,
and we will denote it by $F_{n-m}$. Partition $F_{n-m}$ into the form

$$F_{n-m} = \begin{Vmatrix} A_m & B_m' \\ C_m & D_m \end{Vmatrix},$$

where $A_m$ is again of order $m$, defining $F_{n-2m}$ of order $n - 2m$. Continue the
process until an $F_i$ is reached which is easily inverted, and solve backwards to reach
$F_{n-m}^{-1}$, and then $A_n^{-1}$, by repeated use of (3).

Formula (3) is of great help in those special cases where A is large but easily 
inverted, such as a diagonal matrix, orthogonal matrix, etc. The labor can then 
be focussed on inverting an F which is much smaller than A n . 


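8. Further identities. It is of some interest to exhibit some matric identities
relevant to formula (3). Using the notation of section 4, let us seek the inverse of
$A_n$ partitioned in the form

$$A_n^{-1} = \begin{Vmatrix} W & X' \\ Y & Z \end{Vmatrix}. \tag{4}$$

An equation to be satisfied is

$$\begin{Vmatrix} W & X' \\ Y & Z \end{Vmatrix} \begin{Vmatrix} A & B' \\ C & D \end{Vmatrix} = \begin{Vmatrix} I & 0 \\ 0 & I \end{Vmatrix},$$

which yields the equations

$$WA + X'C = I \tag{5}$$

$$WB' + X'D = 0 \tag{6}$$

$$YA + ZC = 0 \tag{7}$$

$$YB' + ZD = I. \tag{8}$$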
If $A$ and $D$ are nonsingular, then from (6) and (7),

$$X' = -WB'D^{-1}, \qquad Y = -ZCA^{-1}. \tag{9}$$

Using (9) in (5) and (8), and remembering the lemma of section 4, we obtain

$$W = (A - B'D^{-1}C)^{-1}, \qquad Z = (D - CA^{-1}B')^{-1}. \tag{10}$$

Using (10) in (9) yields

$$X' = -(A - B'D^{-1}C)^{-1}B'D^{-1}, \qquad Y = -(D - CA^{-1}B')^{-1}CA^{-1}. \tag{11}$$

Putting (10) and (11) into (4) completes the formula

$$A_n^{-1} = \begin{Vmatrix} (A - B'D^{-1}C)^{-1} & -(A - B'D^{-1}C)^{-1}B'D^{-1} \\ -(D - CA^{-1}B')^{-1}CA^{-1} & (D - CA^{-1}B')^{-1} \end{Vmatrix}. \tag{12}$$

Comparing (3) with (12), we have the identities

$$(A - B'D^{-1}C)^{-1} = A^{-1} + A^{-1}B'(D - CA^{-1}B')^{-1}CA^{-1} \tag{13}$$

$$(A - B'D^{-1}C)^{-1}B'D^{-1} = A^{-1}B'(D - CA^{-1}B')^{-1}, \tag{14}$$

which may of course be verified by direct simplification.
An important feature of each of these identities is that the matrix in parentheses
on the left is of order $m$, while that in parentheses on the right is of order $n - m$.

A special case of (13) was noticed by the writer [3], [4] and of (14) by Ledermann
([7], [8]) and the writer ([3], [4]), in connection with regression problems
of factor analysis. In this special case, $A$ is a diagonal matrix and hence easily
inverted; $n - m$ is the number of common factors, which is usually small compared
with $m$; the correlation matrix of $m$ observed variates is given factored
into the form $A - B'D^{-1}C$; and the work of inverting the correlation matrix of
order $m$ is simplified essentially into inverting a much smaller matrix.

It should be noticed that (12), (13), and (14) assume that both $A$ and $D$ are
nonsingular, where (3) assumes only that $A$ is nonsingular (since then $F$ must be
nonsingular from the lemma of section 4).
THE FREQUENCY DISTRIBUTION OF DEVIATES FROM MEANS AND
REGRESSION LINES IN SAMPLES FROM A MULTIVARIATE
NORMAL POPULATION

By D. J. Finney
Oxford University, England


1. Summary. The joint frequency distribution has been found for any set
of the $(n - k)$ deviates from their sample mean of each of the $t$ variates in a sample
from a multivariate normal population. Expressions for the variance of any
single deviate in this distribution, the correlation coefficient between any pair
of deviates, and certain partial correlation coefficients between any pair have also
been obtained.

These results have been generalized so as to include the corresponding proper- 
ties of deviates from a set of t multiple linear regression equations estimated 
from the sample, the m independent variates being the same for each of the t 
dependent. 


2. Introduction. Some years ago, Irwin published results relating to the frequency
distribution of the deviations of individual observations from the mean
of a sample drawn from a normal population (see [1]). He derived an expression
for the joint distribution of any number of these deviates, which distribution
is always of the normal multivariate form, and thence obtained the total and
partial correlation coefficients between any pair of the deviates.

The purpose of this paper is to discuss the generalization of Irwin's problem,
firstly to the properties of the deviates of individual observations from the mean
in a sample from a multivariate normal population and secondly to the properties
of deviates from a regression equation instead of from a mean. So far as is known
to the writer, Irwin's results are of little practical importance, and these
generalizations are probably of no practical value whatsoever. Nevertheless, they have
some interest as additions to the knowledge of the mathematical properties of the
normal frequency function, and for that reason alone they are put on record here.


3. Deviations from the sample mean. Irwin based his discussion on a normal
population with mean $m$ and variance $\sigma^2$, but the algebra is simplified a little,
without any real loss of generality in the final results, if, by means of a preliminary
transformation, these parameters of position and scale are made zero and
unity respectively. The multivariate normal distribution in the $t$ variates ${}_iy$,
$(i = 1, 2, \cdots, t)$, each with mean zero and variance unity, has the frequency
function

$$\frac{1}{(2\pi)^{\frac12 t} R^{\frac12}} \exp\left\{ -\frac{1}{2R}\,\rho^{ij}\,{}_iy\,{}_jy \right\}, \tag{1}$$
where $i, j = 1, 2, \cdots, t$; $\rho^{ij}$ is the cofactor of the element $\rho_{ij}$ in the determinant
of population correlation coefficients

$$R = \begin{vmatrix} 1 & \rho_{12} & \rho_{13} & \cdots & \rho_{1t} \\ \rho_{12} & 1 & \rho_{23} & \cdots & \rho_{2t} \\ \vdots & \vdots & \vdots & & \vdots \\ \rho_{1t} & \rho_{2t} & \rho_{3t} & \cdots & 1 \end{vmatrix}. \tag{2}$$

A summation convention for the affixes $i, j$ is understood throughout this paper,
except when the contrary is explicitly stated.

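Let $({}_iy_p)$ represent a sample of $n$ independent sets of values of the $t$ variates
randomly selected from the population, $(p = 1, 2, \cdots, n)$. Then the element
of probability for the sample is

$$\frac{1}{(2\pi)^{\frac12 nt} R^{\frac12 n}} \exp\left\{ -\frac{1}{2R}\,\rho^{ij} \sum_p {}_iy_p\,{}_jy_p \right\} \prod \{d({}_iy_p)\}. \tag{3}$$

If ${}_i\bar{y}$ is the mean of the $n$ sample values of ${}_iy$, the deviates from the mean are
$({}_iY_p)$, where

$${}_iY_p = {}_iy_p - {}_i\bar{y} = \left(\delta_{pq} - \frac{1}{n}\right) {}_iy_q, \tag{4}$$

the summation being taken over $q = 1, 2, \cdots, n$ with

$$\delta_{pq} = \begin{cases} 1 & \text{if } p = q \\ 0 & \text{if } p \neq q. \end{cases}$$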


Now the ${}_iY$ are linear combinations of normally distributed variates, and are
therefore themselves normally distributed. Clearly

$$E({}_iY_p) = 0 \tag{5}$$

and, from an expansion by means of equation (4) using

$$E({}_iy_p\,{}_jy_q) = \delta_{pq}\,\rho_{ij},$$

$$E({}_iY_p\,{}_jY_q) = \left(\delta_{pq} - \frac{1}{n}\right)\rho_{ij}, \tag{6}$$

where $\rho_{ii} = 1$ (not summed). Consequently the variance of any one deviate is

$$\sigma^2({}_iY_p) = \frac{n - 1}{n}, \tag{7}$$

and the correlation coefficient between any pair is

$$\rho({}_iY_p, {}_jY_q) = \frac{n\delta_{pq} - 1}{n - 1}\,\rho_{ij}. \tag{8}$$

Equation (7) and equation (8) for the particular case of $i = j$ agree with the
well-known results that Irwin has already given as equations (10) of his paper.
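Equations (6) to (8) can be checked by carrying out the expansion of equation (4) directly. The sketch below, in Python with exact fractions, sums the products of the coefficients $(\delta_{pr} - 1/n)$ over the $n$ independent observations; the function names are illustrative and observations are indexed from zero.

```python
from fractions import Fraction


def deviate_covariance(n, rho_ij, p, q):
    """E(iY_p jY_q) obtained by expanding both deviates over the n observations.

    iY_p = sum_r (delta_pr - 1/n) iy_r, and E(iy_r jy_s) = delta_rs * rho_ij,
    so only terms with a common observation r contribute.
    """
    inv_n = Fraction(1, n)
    total = sum(((1 if p == r else 0) - inv_n) * ((1 if q == r else 0) - inv_n)
                for r in range(n))
    return Fraction(rho_ij) * total


def deviate_correlation(n, rho_ij, p, q):
    """Correlation between iY_p and jY_q; every deviate has variance (n - 1)/n."""
    variance = deviate_covariance(n, 1, p, p)
    return deviate_covariance(n, rho_ij, p, q) / variance


if __name__ == "__main__":
    n = 6
    print(deviate_covariance(n, 1, 0, 0))                # variance of one deviate
    print(deviate_correlation(n, 1, 0, 1))               # same variate, different observations
    print(deviate_correlation(n, Fraction(3, 5), 2, 2))  # different variates, same observation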
For any $i$, only $(n - 1)$ of the deviates ${}_iY_p$ are functionally independent. The
joint distribution of these for $p = 1, 2, \cdots, (n - k)$ may be obtained from an
inversion of the matrix of correlation coefficients. If $\Delta$ is the determinant of
this matrix and $\Delta({}_iY_p, {}_jY_q)$ the cofactor corresponding to the two elements
specified, this inversion shows that

$$\frac{\Delta({}_iY_p, {}_jY_q)}{\Delta} = \frac{n - 1}{n}\left(\delta_{pq} + \frac{1}{k}\right)\frac{\rho^{ij}}{R}. \tag{9}$$

The joint distribution is therefore

$$\text{const.} \times \exp\left\{ -\frac{1}{2R}\,\rho^{ij}\left(\delta_{pq} + \frac{1}{k}\right) {}_iY_p\,{}_jY_q \right\} \prod (d\,{}_iY_p). \tag{10}$$

Now $\Delta$ may be evaluated as

$$\Delta = \left(\frac{n}{n - 1}\right)^{t(n-k)} \left(\frac{k}{n}\right)^{t} R^{\,n-k},$$

and the constant multiplier in equation (10) is therefore

$$\frac{(n/k)^{\frac12 t}}{\{(2\pi)^t R\}^{\frac12 (n-k)}}. \tag{11}$$

From equation (9), the partial correlation coefficient between any two of the
variates in the distribution (10), the remaining $t(n - k) - 2$ being held constant,
is written down as

$$\text{partial correlation coefficient between } {}_iY_p,\ {}_jY_q = -\frac{k\delta_{pq} + 1}{k + 1}\,\frac{\rho^{ij}}{(\rho^{ii}\rho^{jj})^{\frac12}}; \tag{12}$$

the summation convention is suspended for this equation.


4. Deviations from regression equations. The results obtained in section
three may be generalized so as to relate to the frequency distribution of deviates
from linear or polynomial regression equations instead of to deviates from means.
Suppose that there are $m$ independent variates $x_\alpha$, $(\alpha = 1, 2, \cdots, m)$, which
take values $x_{\alpha p}$ corresponding to the sample observations ${}_iy_p$; polynomial
regressions may be included by taking powers of an $x$ as separate variates. If a
conventional variate $x_0$, whose value is always unity, be introduced, the regression
equation of ${}_iy$ on $x_\alpha$, $(\alpha = 0, 1, 2, \cdots, m)$, may be written

$${}_iY = {}_ib^{\alpha} x_\alpha, \tag{13}$$

where a summation convention is understood for $\alpha = 0, 1, \cdots, m$ and the
regression coefficients are the solutions of the normal equations

$${}_ib^{\alpha} \sum_p x_{\alpha p} x_{\beta p} = \sum_p x_{\beta p}\,{}_iy_p. \tag{14}$$

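Write

$$B_{\alpha\beta} = \sum_p x_{\alpha p} x_{\beta p} \tag{15}$$

and let $(B^{\alpha\beta})$ be the inverse matrix of $(B_{\alpha\beta})$.

Then the solutions of equations (14) are

$${}_ib^{\alpha} = B^{\alpha\beta} \sum_p x_{\beta p}\,{}_iy_p. \tag{16}$$

If the deviation of ${}_iy_p$ from the regression equation (13) is ${}_iZ_p$, then

$${}_iZ_p = {}_iy_p - {}_iY_p = (\delta_{pq} - B^{\alpha\beta} x_{\alpha p} x_{\beta q})\,{}_iy_q, \tag{17}$$

the summation for $q$ being over $q = 1, 2, \cdots, n$. As for equation (5),

$$E({}_iZ_p) = 0. \tag{18}$$

Also

$$E({}_iZ_p\,{}_jZ_q) = (\delta_{pq} - B^{\alpha\beta} x_{\alpha p} x_{\beta q})\,\rho_{ij}, \tag{19}$$

since by definition

$$B^{\alpha\beta} B_{\alpha\gamma} = \delta^{\beta}_{\gamma}.$$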


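Write now $\theta$ for the square matrix of $(m + 1)$ rows and columns whose elements
are the $B_{\alpha\beta}$, and $X_p$ for the single column matrix of values $x$ corresponding to
the $p$th observation; i.e.

$$\theta = (B_{\alpha\beta}) \tag{20}$$

and

$$X_p = \begin{Vmatrix} x_{0p} \\ x_{1p} \\ \vdots \\ x_{mp} \end{Vmatrix}. \tag{21}$$

Write also

$$\theta_{p,q,r,\cdots} = \theta - X_pX_p' - X_qX_q' - X_rX_r' - \cdots. \tag{22}$$

Then

$$|\theta_p| = |\theta|\,(1 - B^{\alpha\beta} x_{\alpha p} x_{\beta p}),$$

and

$$|\theta_{pq}| - |\theta_{pq} + X_pX_q'| = -|\theta|\,B^{\alpha\beta} x_{\alpha p} x_{\beta q}.$$

Hence, from equation (19), the variance of a deviate may be written

$$\sigma^2({}_iZ_p) = \frac{|\theta_p|}{|\theta|}, \tag{23}$$

and the correlation coefficient between any pair of deviates is

$$\rho({}_iZ_p, {}_jZ_q) = \begin{cases} \rho_{ij} & (p = q) \\[4pt] \dfrac{|\theta_{pq}| - |\theta_{pq} + X_pX_q'|}{\{|\theta_p|\,|\theta_q|\}^{\frac12}}\,\rho_{ij} & (p \neq q). \end{cases} \tag{24}$$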

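For any $i$, only $(n - m - 1)$ of the deviates ${}_iZ_p$ are functionally independent.
The joint distribution of these for $p = 1, 2, \cdots, (n - k)$ and any $k \ge m + 1$
may be found by inversion of the matrix of correlation coefficients obtained from
equation (24). The multiplier of the exponential in this distribution of $t(n - k)$
variables is

$$\frac{|\theta|^{\frac12 t(n-k)}}{(2\pi)^{\frac12 t(n-k)}\,R^{\frac12 (n-k)}\,D^{\frac12 t}},$$

where

$$D = \begin{vmatrix} |\theta_1| & |\theta_{12}| - |\theta_{12} + X_1X_2'| & \cdots & |\theta_{1,n-k}| - |\theta_{1,n-k} + X_1X_{n-k}'| \\ |\theta_{12}| - |\theta_{12} + X_1X_2'| & |\theta_2| & \cdots & |\theta_{2,n-k}| - |\theta_{2,n-k} + X_2X_{n-k}'| \\ \vdots & \vdots & & \vdots \\ |\theta_{1,n-k}| - |\theta_{1,n-k} + X_1X_{n-k}'| & |\theta_{2,n-k}| - |\theta_{2,n-k} + X_2X_{n-k}'| & \cdots & |\theta_{n-k}| \end{vmatrix}.$$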

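Since $\theta$ is positive definite, there exists a non-singular matrix $K$ such that

$$K\theta K' = I.$$

Then the $X_p$ may be transformed to new column matrices $W_p$ by

$$KX_p = W_p = \begin{Vmatrix} w_{0p} \\ w_{1p} \\ \vdots \\ w_{mp} \end{Vmatrix},$$

and consequently

$$X_p = K^{-1}W_p.$$

It follows that

$$|\theta_p| = |\theta|\,|I - W_pW_p'|,$$

which may be reduced to the form

$$|\theta_p| = |\theta|\,(1 - w_{\alpha p} w_{\alpha p}).$$

Similarly

$$|\theta_{pq}| - |\theta_{pq} + X_pX_q'| = -|\theta|\,w_{\alpha p} w_{\alpha q}.$$

Hence

$$D = |\theta|^{n-k} \begin{vmatrix} 1 - w_{\alpha 1}w_{\alpha 1} & -w_{\alpha 1}w_{\alpha 2} & \cdots & -w_{\alpha 1}w_{\alpha, n-k} \\ -w_{\alpha 1}w_{\alpha 2} & 1 - w_{\alpha 2}w_{\alpha 2} & \cdots & -w_{\alpha 2}w_{\alpha, n-k} \\ \vdots & \vdots & & \vdots \\ -w_{\alpha 1}w_{\alpha, n-k} & -w_{\alpha 2}w_{\alpha, n-k} & \cdots & 1 - w_{\alpha, n-k}w_{\alpha, n-k} \end{vmatrix}.$$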

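This may be transformed into

$$D = |\theta|^{n-k} \begin{vmatrix} I_{n-k} & \begin{matrix} W_1' \\ W_2' \\ \vdots \\ W_{n-k}' \end{matrix} \\ \begin{matrix} W_1 & W_2 & \cdots & W_{n-k} \end{matrix} & I_{m+1} \end{vmatrix}$$

$$= |\theta|^{n-k}\,|I - W_1W_1' - W_2W_2' - \cdots - W_{n-k}W_{n-k}'|$$

$$= |\theta|^{n-k-1}\,|\theta_{1,2,\cdots,n-k}|.$$

Thus, finally, the constant in the distribution is found to be

$$\frac{1}{\{(2\pi)^t R\}^{\frac12 (n-k)}} \left( \frac{|\theta|}{|\Omega_k|} \right)^{\frac12 t}, \tag{25}$$

in which $\Omega_k$ has been written for $\theta_{1,2,\cdots,n-k}$, a matrix of the same form as $\theta$
but calculated from the last $k$ sets of observations only.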

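The cofactors of the matrix of correlation coefficients, required for the coefficients
of the quadratic form in the distribution, can be derived in a similar manner.
The distribution may be written

$$\frac{1}{\{(2\pi)^t R\}^{\frac12 (n-k)}} \left( \frac{|\theta|}{|\Omega_k|} \right)^{\frac12 t} \exp\left\{ -\frac{1}{2R}\,\rho^{ij} \left( \frac{|\Omega_k + X_pX_q'|}{|\Omega_k|} - 1 + \delta_{pq} \right) {}_iZ_p\,{}_jZ_q \right\} \prod (d\,{}_iZ_p), \tag{26}$$

of which the distribution (10) is easily seen to be the particular case for $m = 0$.

From (26), the partial correlation coefficient between any pair of deviates,
${}_iZ_p$ and ${}_jZ_q$, may be written down as

$$-\frac{|\Omega_k + X_pX_q'| + (\delta_{pq} - 1)\,|\Omega_k|}{\{|\Omega_k + X_pX_p'| \cdot |\Omega_k + X_qX_q'|\}^{\frac12}}\,\frac{\rho^{ij}}{(\rho^{ii}\rho^{jj})^{\frac12}}; \tag{27}$$

in this expression the summation convention is again suspended.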
ON THE ASYMPTOTIC DISTRIBUTIONS OF CERTAIN STATISTICS
USED IN TESTING THE INDEPENDENCE BETWEEN SUCCESSIVE 
OBSERVATIONS FROM A NORMAL POPULATION 

By P. L. Hsu 
Columbia University 

1. The statistics to be considered here have the general expression

$$T = \frac{Q}{S}, \qquad Q = \sum_{i,j=1}^{N} a_{ij}(x_i - \bar{x})(x_j - \bar{x}), \qquad S = \sum_{i=1}^{N} (x_i - \bar{x})^2,$$

where $(x_1, \cdots, x_N)$ is a sample from a normal population whose mean and variance
can evidently be assumed to be 0 and 1 respectively.¹ The purpose of this
note is to study the asymptotic distribution of $T$ assuming that the $x_i$ are
independent. The whole work may be regarded as a straightforward application of
Cramér's theory of asymptotic expansion (see [1], pp. 69-88).

If $A = [a_{ij}]$ and $\gamma$ is the row vector $N^{-\frac12}[1, 1, \cdots, 1, 1]$ the quadratic form $Q$
has the matrix $(I - \gamma'\gamma)A(I - \gamma'\gamma)$. The latent roots of this matrix, which are
also the latent roots of $A(I - \gamma'\gamma)^2 = A(I - \gamma'\gamma)$, will be denoted by $0, \lambda_1, \cdots, \lambda_n$,
with $n = N - 1$. Then $Q$ and $S$ can be simultaneously diagonalized (by a
rotation of the $N$-dimensional space), so that

$$Q = \sum_{r=1}^{n} \lambda_r y_r^2, \qquad S = \sum_{r=1}^{n} y_r^2,$$

where the $y_r$ are again independently and normally distributed with zero mean
and unit variance.

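We shall make the following assumptions:

(a) $|\lambda_r| \le 1$ for all $r$.

(b) There is a positive number $c$ independent of $n$ such that

$$\sum_{r=1}^{n} (\lambda_r - \bar{\lambda})^2 \ge cn, \qquad \text{where } \bar{\lambda} = \frac{1}{n} \sum_{r=1}^{n} \lambda_r.$$

Write

$$z = \frac{x \sqrt{2 \sum_r (\lambda_r - \bar{\lambda})^2}}{\sqrt{n^2 - 2nx^2}}, \qquad s_\nu(x) = \sum_{r=1}^{n} (\lambda_r - \bar{\lambda} - z)^\nu,$$

$$X_r = (\lambda_r - \bar{\lambda} - z)(y_r^2 - 1), \qquad G(x) = \Pr\{T \le \bar{\lambda} + z\}.$$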

¹ The exact and the approximate distribution of such statistics were a recent subject of
study by a number of statisticians. See W. J. Dixon, "Further contributions to the problem
of serial correlation," Annals of Math. Stat., Vol. 15 (1944), pp. 119-144. Further
references are listed in Dixon's paper.

Then it can easily be verified that

$$G(x) = \Pr\left\{ \frac{\sum_r X_r}{\sqrt{2 s_2(x)}} \le x \right\}.$$

This expression of $G(x)$ shows that the application of Cramér's expansion is at
hand, since $E(X_r) = 0$ and $2s_2(x)$ is the variance of $\sum X_r$. Let $\rho_{kn}$ and $T_{kn}$
stand for the same quantities as defined in Cramér's work (see [1], pp. 70-71).
Since moments of all order of $X_r$ exist, we may use $2k + 2$ in place of $k$. We have


$$\rho_{2k+2,n} = \frac{n^k m_k\,s_{2k+2}(x)}{\{2 s_2(x)\}^{k+1}}, \qquad T_{2k+2,n} = \frac{\sqrt{n}}{4\,\rho_{2k+2,n}^{3/(2k+2)}},$$

where $m_k = E(y^2 - 1)^{2k+2}$ and $y$ is a normal variate with mean 0 and variance 1.

By virtue of assumption (a) $|T| \le 1$. Therefore we may confine ourselves
to the range of values for which $|\bar{\lambda} + z| \le 1$. Then $|\lambda_r - \bar{\lambda} - z| \le 2$. Also,
by assumption (b), $s_2(x) \ge \sum (\lambda_r - \bar{\lambda})^2 \ge cn$. Hence $\rho_{2k+2,n}$, and in consequence
$\sqrt{n}/T_{2k+2,n}$, are less than some constant independent of $n$ and $x$. The
remainder of Cramér's expansion, if it is justifiable, will therefore be less than
$Mn^{-k}$, where $M$ is independent of $n$ and $x$. The justification consists in verifying
that the following condition is satisfied: if $f_r(t)$ is the characteristic function of
$X_r$ and $A$ is any positive number, then

$$\text{l.u.b.} \prod_{r=1}^{n} |f_r(t)| \qquad \text{for} \qquad |t| \ge \frac{T_{2k+2,n}}{\sqrt{2 s_2(x)}}$$

is less than $M_1 T_{2k+2,n}^{-A}$, where $M_1$ is independent of $n$ and $x$ (see [1], p. 85). Since
$T_{2k+2,n} \le \frac14 \sqrt{n}$ ² and $s_2(x) \ge cn$, it is sufficient to show that, if $a$ and $A$ are
any positive numbers and if

$$U = \text{l.u.b.} \prod_{r=1}^{n} |f_r(t)| \qquad \text{for} \qquad |t| \ge a,$$

then $U < M_2 n^{-A}$, where $M_2$ is independent of $n$ and $x$. Now

$$|f_r(t)| = \{1 + 4t^2(\lambda_r - \bar{\lambda} - z)^2\}^{-\frac14},$$

whence

$$U = \prod_{r=1}^{n} \{1 + 4a^2(\lambda_r - \bar{\lambda} - z)^2\}^{-\frac14}.$$

Let $p$ be the number of $\lambda_r$ for which $(\lambda_r - \bar{\lambda} - z)^2 \ge \frac12 c$. Then $cn \le s_2(x) \le
\frac12 c(n - p) + 4p$; hence $cn \le (8 - c)p$ and

$$U \le (1 + 2a^2c)^{-\frac14 p} \le (1 + 2a^2c)^{-cn/(4(8-c))}.$$

This shows that the desired condition on $U$ is satisfied, and that therefore
Cramér's procedure can be adopted.

² This follows from the fact that $\rho_{2k+2,n} \ge 1$. Cf. Cramér, [1], p. 70.
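Wherever Cramér's asymptotic expansion is valid, the terms in the expansion
are most conveniently obtained with the help of Cornish and Fisher's symbolic
expression (see [2]):

$$\exp\left[ \sum_{j=3}^{\infty} \frac{\kappa_j}{j!} (-D)^j \right] \Phi(x), \qquad D = \frac{d}{dx},$$

where

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac12 y^2}\,dy$$

and $\kappa_j$ is the $j$th semi-invariant of the random variable whose distribution is
under asymptotic expansion. In the present case we have

$$\frac{\kappa_j}{j!} = \frac{\beta_j(x)}{n^{\frac12 j - 1}}, \qquad \text{where} \qquad \beta_j(x) = \frac{2^{j-1}\,n^{\frac12 j - 1}\,s_j(x)}{j\,\{2 s_2(x)\}^{\frac12 j}}.$$

Hence we may express our result as follows:

$$G(x) = \exp\left[ \sum_{j=3}^{\infty} \frac{\beta_j(x)}{n^{\frac12 j - 1}} (-D)^j \right] \Phi(x) + R_k(x), \tag{1}$$

where $|R_k(x)| < Mn^{-k}$ and $M$ is independent of $n$ and $x$. The symbolic
exponential in (1) is to be expanded as far as and including the term in $n^{-\frac12 (2k-1)}$.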

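2. Let us apply the result (1) to the following three statistics: $T_\alpha = Q_\alpha/S$,
$(\alpha = 1, 2, 3)$, where

$$Q_1 = \sum_{i=1}^{N} (x_i - \bar{x})(x_{i+1} - \bar{x}) \qquad \text{with } x_{N+1} = x_1,$$

$$Q_2 = \tfrac12 (x_1 - \bar{x})^2 + \tfrac12 (x_N - \bar{x})^2 + \sum_{i=1}^{N-1} (x_i - \bar{x})(x_{i+1} - \bar{x}),$$

$$Q_3 = \sum_{i=1}^{N-1} (x_i - \bar{x})(x_{i+1} - \bar{x}).$$

$T_2$ is simply related with $T^* = Q^*/S$, where

$$Q^* = \sum_{i=1}^{N-1} (x_{i+1} - x_i)^2,$$

for we have $Q_2 = S - \frac12 Q^*$, whence $T_2 = 1 - \frac12 T^*$. We shall write $\lambda_r^{(\alpha)}$ for the
$\lambda$'s corresponding to $Q_\alpha$, and

$$b_{m\alpha} = \sum_{r=1}^{n} \left(\lambda_r^{(\alpha)}\right)^m, \qquad (\alpha = 1, 2, 3).$$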
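(i) For $Q_1$ we have $\lambda_r^{(1)} = \cos \dfrac{2\pi r}{N}$ (see [3]). Since

$$\cos^m \theta = \frac{1}{2^m} \sum_{j=0}^{m} \binom{m}{j} e^{i(2j - m)\theta},$$

we have

$$b_{m1} = \frac{1}{2^m} \sum_{j=0}^{m} \binom{m}{j} \sum_{r=1}^{n} \xi^r, \qquad \text{where } \xi = e^{2\pi i (2j - m)/N}.$$

If $m < n$, then

$$\sum_{r=1}^{n} \xi^r = -1 \ \text{ if } j \neq \tfrac12 m, \qquad = n \ \text{ if } j = \tfrac12 m.$$

Hence, for $m < n$, $b_{m1} = -1$ if $m$ is odd, $b_{m1} = \dfrac{N}{2^m} \dbinom{m}{\frac12 m} - 1$ if $m$ is
even. In particular

$$\bar{\lambda}^{(1)} = -\frac{1}{n}, \qquad \sum_{r=1}^{n} \left(\lambda_r^{(1)} - \bar{\lambda}^{(1)}\right)^2 = \frac{n^2 - n - 2}{2n} \ge 0.4n \ \text{ if } n \ge 7.$$

Hence assumptions (a) and (b) are true (for $n \ge 7$). The $s_\nu(x)$ are conveniently
computed with the help of $b_{m1}$. The $\beta_j(x)$ are then computed to yield
the terms in (1).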
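The closed form for $b_{m1}$ is easy to confirm numerically. The sketch below, in Python, compares the direct power sums of $\cos(2\pi r/N)$ over $r = 1, \cdots, N - 1$ with the closed form for $m < n$, and also evaluates the sum of squared deviations used in assumption (b). The function names are illustrative.

```python
import math


def power_sum_direct(m, N):
    """b_{m1}: sum of cos^m(2*pi*r/N) over the n = N - 1 roots r = 1, ..., N - 1."""
    return sum(math.cos(2.0 * math.pi * r / N) ** m for r in range(1, N))


def power_sum_closed(m, N):
    """Closed form for b_{m1}, valid for m < n = N - 1."""
    if m % 2 == 1:
        return -1.0
    return N * math.comb(m, m // 2) / 2 ** m - 1.0


def spread(N):
    """Sum of squared deviations of the roots from their mean."""
    n = N - 1
    roots = [math.cos(2.0 * math.pi * r / N) for r in range(1, N)]
    mean = sum(roots) / n
    return sum((v - mean) ** 2 for v in roots)


if __name__ == "__main__":
    N = 12
    for m in range(1, N - 1):
        print(m, round(power_sum_direct(m, N), 10), power_sum_closed(m, N))
    n = N - 1
    print(spread(N), (n * n - n - 2) / (2 * n), 0.4 * n)
```

The restriction $m < n$ matters: once $m$ reaches $N$, further values of $j$ make $\xi$ equal to unity and the closed form no longer applies.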

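(ii) The $\lambda$'s corresponding to $Q^*$ are $4 \sin^2 \dfrac{r\pi}{2N}$ (see [4]). Hence

$$\lambda_r^{(2)} = \cos \frac{r\pi}{N}.$$

By a computation similar to that in (i) we easily obtain $b_{m2} = \dfrac{N}{2^m} \dbinom{m}{\frac12 m} - 1$ for
even $m$ and $b_{m2} = 0$ for odd $m$, provided $m < 2n$. In particular, $\bar{\lambda}^{(2)} = 0$,

$$\sum_r \left(\lambda_r^{(2)} - \bar{\lambda}^{(2)}\right)^2 = \tfrac12 (n - 1) \ge .4n \ \text{ for } n \ge 5.$$

Hence assumptions (a) and (b) are true (for $n \ge 5$).

(iii) In the case of $Q_3$ the matrix $A$ is

$$A = \begin{Vmatrix} 0 & \frac12 & & & \\ \frac12 & 0 & \frac12 & & \\ & \ddots & \ddots & \ddots & \\ & & \frac12 & 0 & \frac12 \\ & & & \frac12 & 0 \end{Vmatrix},$$

whose latent roots are $\cos \dfrac{t\pi}{N + 1}$, $(t = 1, \cdots, N)$ (see [5]), all less than or
equal to unity in absolute value. It follows that the same is true for the $\lambda_r^{(3)}$.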
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Hence assumption (a) js true. Unlike the two previous cases, there is no sim- 
ple expression for b m3 With the help of the formula 

Ks = tr [A{I - y '7)} m 

we may compute 6 m 3 for small values of m. Thus 


5ia — — 


n + 1 


, n 2n — 1 | ft 

1)23 " 2 nTTT + (» + l) 2 


71 


, 3 (n - 1) , 3n(2n - 1) 

33 n+ 1 + 2(n + 1)* ( n + l) 3 

, 3n — 2 8n — 11 , 4n(rr - 1) (2n - l) 2 _ 2n(2n - 1) n 1 

^43 ~ rt i\/.. i i\ "T" 


8 2(77 + 1) (71 + 1)2 2(n + l) 3 (n + l) a (n +|1)« 

5(4n - 7) 5 h(8tx - 11) . 5(2 n - 1 )(» - 1) 5n 2 (n - 1) 

“ 4(n + 1) + 8 (ti + l) 2 + 2(n. + l) 2 (n, + l) a 


5ti(2ti - l) 2 5n 3 (2n - 1) 

' 1 


n 


4(n+l) 3 2(ft + l) 5 (ft + 1) 6 

k— x* n 2ft — 1 , n* — n ^ n A e ^ 

— tt + 7 — > °- 4n for n ^ 10- 

ft + 1 (ft + 1)2 — 

Hence assumption (b) is true (for $n \ge 10$). Using these values of $b_{m3}$ we may
compute fa(x), fa(x) and p 6 (x). By (1) we have 




G{x) = *(*) - ^Pi(x)^ 3] (x) + i (8 < (x)$ c<1 (z) + $$(x)$ (,) (x)) 

ft’ ft 


- p (|9*(*)^ w (x) - ft(*)0,(*)* m (s) + £$(x)* w (x)) + «(*), 
where $|R(x)| < M n^{-2}$ and $M$ is independent of $n$ and $x$.

NOTES


This section is devoted to brief research and expository articles, notes on 
methodology and other short items. 


ESTIMATING THE PARAMETERS OF A RECTANGULAR 
DISTRIBUTION 

By A. George Carlton 
Columbia University 

1. Introduction. In this note, the range and midrange of the sample are
shown to be a pair of sufficient statistics, and maximum likelihood estimates,
for the true range and true mean of a rectangular distribution; exact and limiting
distributions of midrange, range, and their ratio are derived; the "efficiencies"
of the sample mean and median as estimates of the true mean are calculated;
and the limiting distribution of the difference between two sample midranges is
derived. All the limiting distributions are non-normal, and the error of estimate
is of order $n^{-1}$ rather than the customary order $n^{-1/2}$. The limiting distribution
of midrange, and the limiting ratio of variances of the midrange and sample
mean, were given by Fisher [1].

$f(x)$ and $F(x)$ are used throughout to designate the probability density
function of $x$ and the distribution function (cumulative probability function) of $x$;
the argument will also indicate the random variable being considered.

2. Exact distribution of midrange, range, and their ratio. Let $x_1, \dots, x_n$
be a set of n independent observations on a random variable having the
rectangular distribution $f(x) = 1/L$, $(\theta - L/2 \le x \le \theta + L/2)$, where $\theta$ is the true mean,
and $L$ the true range. The minimum observation $u$ and the maximum observation
$v$ are a pair of sufficient statistics for $\theta$ and $L$, as the conditional distribution
of the remaining observations for given $u$ and $v$ is independent of $\theta$ and $L$:

\[ f(x_1, \dots, x_n \mid u, v) = (v - u)^{-(n-2)}. \]

The midrange $\hat\theta = \tfrac12(u + v)$ and the range $\hat L = v - u$ are maximum likelihood
estimates of $\theta$ and $L$, respectively, as they are the parameter values which
uniquely maximize $f(x_1, \dots, x_n)$ for the given set of observations. We shall
assume that the random variable is normalized by change of origin and change
of scale so that $\theta = 0$ and $L = 1$. The joint probability density function of $u$
and $v$ is

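\[ f(u, v) = \frac{\partial^2 P\{u, v\}}{\partial v\, \partial(-u)} = \frac{\partial^2 (v - u)^n}{\partial v\, \partial(-u)} = n(n-1)(v-u)^{n-2}, \qquad (-\tfrac12 \le u \le v \le \tfrac12). \tag{1} \]

Making the transformation $\hat\theta = \tfrac12(u + v)$, $\hat L = v - u$ in (1),

\[ f(\hat\theta, \hat L) = n(n-1)\hat L^{\,n-2}, \qquad (0 \le 2|\hat\theta| \le 1 - \hat L \le 1). \tag{2} \]

Integrating out $\hat L$ from 0 to $(1 - 2|\hat\theta|)$,

\[ f(\hat\theta) = n(1 - 2|\hat\theta|)^{n-1}, \qquad (|\hat\theta| \le \tfrac12). \]

\[ |F(\hat\theta) - F(0)| = \tfrac12 - \tfrac12 (1 - 2|\hat\theta|)^{n}, \qquad (|\hat\theta| \le \tfrac12). \tag{3} \]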

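Odd moments vanish by symmetry; even order moments are

\[ \mu_{2k}(\hat\theta) = \int_{-1/2}^{1/2} n\,\hat\theta^{2k} (1 - 2|\hat\theta|)^{n-1}\, d\hat\theta = 2^{-2k} \binom{n+2k}{2k}^{-1}. \tag{4} \]

In (2), integrating out $\hat\theta$ from $\tfrac12(\hat L - 1)$ to $\tfrac12(1 - \hat L)$,

\[ f(\hat L) = n(n-1)\hat L^{\,n-2}(1 - \hat L), \qquad (0 \le \hat L \le 1). \]

\[ F(\hat L) = n(n-1) \int_0^{\hat L} z^{n-2}(1 - z)\, dz = n(n-1)\, B_{\hat L}(n-1, 2), \qquad (0 \le \hat L \le 1). \tag{5} \]

\[ \mu_k(\hat L) = n(n-1) \int_0^1 z^{n+k-2}(1 - z)\, dz = \frac{n(n-1)}{(n+k)(n+k-1)}. \]

Thus $\mu_1(\hat L) = (n - 1)/(n + 1)$; hence the bias of $\hat L$ can be removed by
multiplying $\hat L$ by $(n + 1)/(n - 1)$.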

The statistic $t = \hat\theta/\hat L$ can be used to test the hypothesis that the mean of a
rectangular distribution of unknown range is 0. To obtain the distribution of $t$
when the hypothesis is true, set $t = \hat\theta/\hat L$ and $\hat L = \hat L$ in (2):

\[ f(t, \hat L) = n(n-1)\hat L^{\,n-1}, \qquad (\hat L \le (1 + 2|t|)^{-1}). \]

\[ f(t) = (n-1)(1 + 2|t|)^{-n}. \tag{6} \]

\[ |F(t) - F(0)| = \tfrac12 - \tfrac12 (1 + 2|t|)^{1-n}. \]

Moments of $t$ do not exist for order greater than $(n - 2)$; for orders not exceeding
$n - 2$, odd moments vanish by symmetry and

\[ \mu_{2k}(t) = 2(n-1) \int_0^\infty t^{2k} (1 + 2t)^{-n}\, dt = 2^{-2k} \binom{n-2}{2k}^{-1}. \]

3. Limiting distributions. $\hat\theta$, $\hat L$, and $t$ have non-normal limiting distributions,
although $\hat\theta$ and $\hat L$ are maximum likelihood estimates; this is explained by the
discontinuity of $f(x, \theta)$ at $x = \theta \pm \tfrac12$. We obtain the limiting distributions of
$q = n\hat\theta$ and $r = n(1 - \hat L)$. Substituting $q$ and $r$ in (2), and proceeding to the
limit for increasing $n$,

\[ \lim f(q, r) = \lim \frac{n-1}{n} \left(1 - \frac{r}{n}\right)^{n-2} = e^{-r}, \qquad (0 \le 2|q| \le r < \infty). \]

The necessary simple integrations yield the following limiting distributions:

\[ f(q) = e^{-2|q|}, \tag{7} \]

\[ |F(q) - F(0)| = \tfrac12 - \tfrac12 e^{-2|q|}, \]

\[ \mu_{2k}(q) = (2k)!\, 2^{-2k}; \qquad \mu_{2k+1}(q) = 0. \]

\[ f(r) = r e^{-r}, \qquad (r \ge 0) \]

\[ F(r) = 1 - (1 + r) e^{-r}, \qquad (r \ge 0) \]

\[ \mu_k(r) = (k + 1)! \]

The limiting distribution of $s = nt$ is the same as that of $n\hat\theta$, as is seen by
comparing (3) and (6).
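The passage to the limit can be illustrated numerically. This sketch assumes the exact distribution function of the midrange in the form |F − F(0)| = ½ − ½(1 − 2|θ̂|)ⁿ, evaluates it at θ̂ = q/n, and compares it with the limiting form ½ − ½e^(−2|q|); the function names are hypothetical.

```python
from math import exp

def exact_gap(n, q):
    """|F(q/n) - F(0)| for the midrange of n observations (normalized population)."""
    return 0.5 - 0.5 * (1.0 - 2.0 * abs(q) / n) ** n

def limiting_gap(q):
    """Limiting value of |F(q) - F(0)| for q = n * midrange."""
    return 0.5 - 0.5 * exp(-2.0 * abs(q))

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, exact_gap(n, 0.7), limiting_gap(0.7))
```

The difference between the two columns shrinks roughly in proportion to 1/n, which is the rate of convergence one would expect from the (1 − r/n)ⁿ form.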

4. Comparison of $\hat\theta$ with $\bar{x}$ and $\tilde{x}$ as estimates of $\theta$. The sample mean $\bar{x}$ and
median $\tilde{x}$ are unbiased estimates of $\theta$.

\[ \mu_2(\bar{x}) = \frac{1}{n} \int_{-1/2}^{1/2} x^2\, dx = \frac{1}{12n}. \tag{8} \]

\[ \mu_2(\tilde{x}) = \int_{-1/2}^{1/2} \tilde{x}^2 f(\tilde{x})\, d\tilde{x} = \int_{-1/2}^{1/2} x^2\, \frac{(2m+1)!}{m!\, m!} \left(\tfrac12 + x\right)^m \left(\tfrac12 - x\right)^m dx \]

for $n = 2m + 1$, $m$ an integer. Substituting $z = 1 - 4x^2$, then simplifying the
Beta function obtained on integration,

\[ \mu_2(\tilde{x}) = \frac{(2m+1)!}{m!\, m!\, 2^{2m+3}} \int_0^1 z^m (1 - z)^{1/2}\, dz = \frac{1}{4(2m+3)} = \frac{1}{4(n+2)}. \tag{9} \]

(4), with $k = 1$, gives $\mu_2(\hat\theta) = \dfrac{1}{2(n+1)(n+2)}$. Comparison of this with (8)
and (9) shows that $\mu_2(\hat\theta)/\mu_2(\bar{x}) = \dfrac{6n}{(n+1)(n+2)}$ and $\mu_2(\tilde{x})/\mu_2(\bar{x}) = 3n/(n+2)$.

As $n$ increases, $\mu_2(\hat\theta)/\mu_2(\bar{x}) \to 6/n \to 0$; and $\mu_2(\tilde{x})/\mu_2(\bar{x}) \to 3$. Thus the
"efficiency" of the mean is zero, and the median is only one-third as "efficient" as the
mean. (The concept of efficiency is not strictly applicable as $\hat\theta$ does not have a
normal limiting distribution.)
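The variance ratios can be tabulated directly from (8), (9) and (4). The sketch below assumes the normalized population and an odd sample size for the median; the function names are illustrative.

```python
from fractions import Fraction

def var_mean(n):
    """Variance of the sample mean, 1/(12n)."""
    return Fraction(1, 12 * n)

def var_median(n):
    """Variance of the sample median for odd n, 1/(4(n+2))."""
    if n % 2 == 0:
        raise ValueError("formula (9) is stated for odd n only")
    return Fraction(1, 4 * (n + 2))

def var_midrange(n):
    """Variance of the sample midrange, 1/(2(n+1)(n+2))."""
    return Fraction(1, 2 * (n + 1) * (n + 2))

if __name__ == "__main__":
    for n in (5, 25, 101):
        print(n,
              float(var_midrange(n) / var_mean(n)),  # tends to 0 like 6/n
              float(var_median(n) / var_mean(n)))    # tends to 3
```

Already at n = 25 the midrange variance is less than a quarter of the variance of the mean, while the median ratio is close to its limit of 3.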

5. Limiting distribution of difference between two midranges. Let $\hat\theta_1$ and
$\hat\theta_2$ be the midranges of samples of $n_1$ and $n_2$ observations, respectively, from two
normalized rectangular populations, and let $z = q_1 - q_2 = n_1\hat\theta_1 - n_2\hat\theta_2$. Applying
the formula for composition of random variables, one obtains from (7),

\[ f(z) = \int_{-\infty}^{\infty} e^{-2|q|}\, e^{-2|z - q|}\, dq \tag{10} \]

\[ \phantom{f(z)} = e^{-2|z|} \int_{-\infty}^{0} e^{4q}\, dq + \int_{0}^{|z|} e^{-2|z|}\, dq + e^{2|z|} \int_{|z|}^{\infty} e^{-4q}\, dq \]

\[ \phantom{f(z)} = \tfrac14 e^{-2|z|} + |z|\, e^{-2|z|} + \tfrac14 e^{-2|z|} = \left(|z| + \tfrac12\right) e^{-2|z|}. \]

\[ |F(z) - F(0)| = \tfrac12 - \tfrac12 (1 + |z|)\, e^{-2|z|}. \]

\[ \mu_{2k}(z) = (k + 1)\,(2k)!\, 2^{-2k}. \]
\[ z = \frac{n_1(v_1 + u_1)}{2(v_1 - u_1)} - \frac{n_2(v_2 + u_2)}{2(v_2 - u_2)} \]

can be used to test the hypothesis of equality of
means of any two rectangular populations, and has in the limit the distribution
(10), if the means of the populations are equal.

6. The one-parameter rectangular distribution. If $f(x) = 1/\lambda$, $(0 \le x \le \lambda)$,
then $f(x_1, \dots, x_n \mid v) = v^{1-n}$. Thus $v$ is a sufficient statistic and is evidently
the maximum likelihood estimate of $\lambda$. Here $F(v) = (v/\lambda)^n$, $f(v) = n v^{n-1} \lambda^{-n}$,
and $\mu_k(v) = \lambda^k n/(n + k)$. The normalized error $y = n(\lambda - v)/\lambda$ has the
probability density function $f(y) = (1 - y/n)^{n-1}$, which tends to $e^{-y}$ as $n$ increases.

ON THE POWER FUNCTION OF THE SIGN TEST FOR
SLIPPAGE OF MEANS

By John E. Walsh 

Princeton University 

1. Summary. This note compares the power functions of the sign test for 
slippage with the power functions of the most powerful test for the case of nor- 
mal populations. The sign test is found to be approximately 95% efficient for 
small samples. 

2. Introduction. Let us consider a univariate population whose mean equals
its median and whose cumulative distribution function is continuous at the
mean. A sampling method of testing the supposition that the mean of this
population exceeds a given constant value $\mu_0$ (slippage to the right) is furnished
by considering how many values of the sample are less than $\mu_0$. An analogous
method applies for testing whether the mean is less than $\mu_0$ (slippage to the left).
A particular class of populations for which the sign test is valid are the normal
populations. This note compares the power functions of the sign test with the
power functions of the most powerful test for slippage for the case in which the
population is normal (Table I). It is shown that the sign test is approximately
95% as efficient as the most powerful test (the Student t-test) for samples of size
4, 5 and 6, and that although the relative efficiency of the sign test decreases as
the sample size increases, its efficiency is approximately 75% for samples of size
13. This supports the idea that for normal populations little efficiency is lost
by using attributes instead of continuous variables if the sample size is small.

In choosing between the sign and Student t-tests for slippage the following
considerations may be of interest:

(a) The sign test is valid for a more general class of populations than the t-test.

(b) The sign test is almost as efficient as the t-test for small samples from
normal populations.

(c) The sign test is much more easily computed than the t-test.

(d) The sign test has a very limited choice of significance levels for small
samples while the t-test can have any desired significance level for any size
sample.

The considerations (a) to (d) also apply in choosing between the sign test and
the Daly test based on $(\bar{x} - \mu_0)/R$, where $\bar{x}$ is the mean and $R$ the range of the
sample used for the test (see [1]).

In section 5, Table II shows that for small size samples the significance levels 
of the sign test do not change greatly if the mean is only approximately equal 
to the median. 

3. Statement of sign test. Let $x_1, \dots, x_n$ be a sample of size n from a
univariate population whose mean equals its median and whose cumulative
distribution function is continuous at the mean, that is, which has the property that

(1) $\Pr(x < \mu) = \Pr(x > \mu) = \tfrac12$,

where $\mu$ is the population mean.

The significance test to decide whether $\mu$ exceeds a given constant value $\mu_0$
is defined by

(2) If m or less of the sample values $x_1, \dots, x_n$ are less than $\mu_0$, accept $\mu > \mu_0$.

The significance test to decide whether $\mu < \mu_0$ is given by

(3) If m or less of $x_1, \dots, x_n$ are greater than $\mu_0$, accept $\mu < \mu_0$.

It is to be observed that in both (2) and (3) the null hypothesis tested is that
$\mu = \mu_0$. In (2) the alternative is $\mu > \mu_0$ and in (3) the alternative is $\mu < \mu_0$.

From (1) it follows immediately that (2) and (3) both have the same
significance level $\alpha(m, n)$, where

\[ \alpha(m, n) = \left(\tfrac12\right)^n \sum_{i=0}^{m} \frac{n!}{i!\,(n-i)!}. \]

Appropriate choices of m and n will result in values of $\alpha(m, n)$ suitable for
significance tests. For example

    α(0, 4) = .0624,        α(1, 8) = .0352,
    α(0, 5) = .0312,        α(1, 9) = .0195,
    α(0, 6) = .0156,        α(1, 10) = .0107,
    α(1, 7) = .0625,        α(2, 13) = .0112.
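The significance levels are binomial tail sums with success probability ½, so they can be generated for any m and n. The helper below is a minimal sketch; the name `alpha` simply mirrors the notation α(m, n).

```python
from math import comb

def alpha(m, n):
    """Significance level of the sign test: P(at most m of n signs) under p = 1/2."""
    return sum(comb(n, i) for i in range(m + 1)) / 2 ** n

if __name__ == "__main__":
    for m, n in [(0, 4), (0, 5), (0, 6), (1, 7), (1, 8), (1, 9), (1, 10), (2, 13)]:
        print(f"alpha({m}, {n}) = {alpha(m, n):.4f}")
```

Scanning a grid of (m, n) pairs with this function is a quick way to see how few significance levels are available for small samples, which is consideration (d) of the Introduction.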


If the population has a continuous distribution function, $\Pr(x_i = x_j;\ i \ne j) = 0$.
In this case let $x_{(r)}$ be the rth largest of $x_1, \dots, x_n$ (counted upward from
the smallest, so that $x_{(1)} \le \dots \le x_{(n)}$). Then (2) can be restated as

(4) If $x_{(m+1)} > \mu_0$, accept $\mu > \mu_0$.

Test (3) is seen to be equivalent to

(5) If $x_{(n-m)} < \mu_0$, accept $\mu < \mu_0$.

Thus for the case of populations with continuous distribution functions it is
only necessary to determine one order statistic and compare it with $\mu_0$ in order
to apply a test.

It is to be observed that a particular class of populations which satisfy (1) are 
those which have distribution functions which are symmetrical and continuous. 
Thus the normal populations represent a particular class for which (4) and (5) 
are valid. 
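The equivalence between counting signs and inspecting a single order statistic can be sketched as follows. The code assumes no observation is exactly equal to μ₀ (which has probability one for a continuous population) and takes the order statistics in increasing order; the function names and the sample values are hypothetical.

```python
def accept_right_by_count(sample, mu0, m):
    """Test (2): accept mu > mu0 when at most m sample values lie below mu0."""
    return sum(1 for x in sample if x < mu0) <= m

def accept_right_by_order(sample, mu0, m):
    """Test (4): accept mu > mu0 when the (m+1)-th smallest value exceeds mu0."""
    return sorted(sample)[m] > mu0

def accept_left_by_order(sample, mu0, m):
    """Test (5): accept mu < mu0 when the (n-m)-th smallest value is below mu0."""
    n = len(sample)
    return sorted(sample)[n - m - 1] < mu0

if __name__ == "__main__":
    data = [5.1, 4.8, 6.3, 5.9, 5.6, 4.2, 6.1, 5.4]   # illustrative values only
    print(accept_right_by_count(data, 4.5, 1), accept_right_by_order(data, 4.5, 1))
    print(accept_right_by_count(data, 5.0, 1), accept_right_by_order(data, 5.0, 1))
```

Sorting the whole sample is more work than needed; a selection routine would find the one required order statistic, but the sort keeps the sketch short.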


4. Comparison with Student t-test. Consider the case in which the population
is normal with mean $\mu$ and variance $\sigma^2$. Then the power function for (4)
is given by

\[ \text{Power Function} = \Pr(x_{(m+1)} > \mu_0) = \frac{n!}{m!\,(n-m-1)!} \int_{\delta}^{\infty} \left( \int_{-\infty}^{x} f(y)\, dy \right)^{m} \left( \int_{x}^{\infty} f(y)\, dy \right)^{n-m-1} f(x)\, dx, \]

where

\[ f(y) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 y^2} \quad \text{and} \quad \delta = \frac{\mu_0 - \mu}{\sigma}. \]

For a normal population, however, it is well known that the most powerful
Studentized test of the one-sided alternative $\mu > \mu_0$ is the appropriate Student
t-test. Values of the power function for the t-test are found for given values of
$\delta$ by using the normal approximation given in [2].
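The order-statistic integral for the sign test has an equivalent binomial form: each observation falls below μ₀ with probability Φ(δ), so the power is the probability that at most m of n observations do so. The sketch below uses that equivalent form rather than the integral itself; `normal_cdf` and `sign_test_power` are illustrative names.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def sign_test_power(m, n, delta):
    """Power of test (4) for a normal population, delta = (mu0 - mu) / sigma."""
    p_below = normal_cdf(delta)          # chance a single value is below mu0
    return sum(comb(n, i) * p_below ** i * (1.0 - p_below) ** (n - i)
               for i in range(m + 1))

if __name__ == "__main__":
    for delta in (-0.5, -1.0, -1.5, -2.0):
        print(delta, round(sign_test_power(0, 4, delta), 3))
```

At δ = 0 the function returns the significance level of the test, which gives a simple consistency check on any tabulated power curve.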

The method of measuring the relative efficiencies of the two types of tests will
be different from the common method of measuring the relative efficiencies of
estimates, which consists in taking the ratio of the variances of the two
estimates as the measure of their relative efficiency. The principle followed here
will be to consider a sign test based on a given sample size and vary the degrees
of freedom of the t-test having the same significance level until the power
functions of the sign test and t-test agree in the sense that in the half-plane $\delta \le 0$
the area between the two power curves for which the sign test power function
exceeds the t-test power function is equal to the analogous area for which the
sign test power function is less than the t-test power function. The
considerations are limited to the half-plane $\delta \le 0$ because the test is one-sided. The size
of the t-test sample having this property divided by the size of the sign test
sample is called the relative efficiency of that sign test. Intuitively this relative
efficiency measures how much more data must be added if the sign test is to
furnish an amount of information equivalent to that supplied by the t-test. In
obtaining the relative efficiencies in the manner described above, the degrees of
freedom of the t-test are allowed to assume fractional values and the values of
the power function are computed using the normal approximation as if it were
valid for fractional degrees of freedom. The number of degrees of freedom, of
course, can only be integral. This method, however, gives an interpolated
measure of the size sample of the t-test having the properties outlined above.
Table I supplies a comparison of the relative efficiencies and the powers of the
sign test and the t-test obtained in the manner just described. Thus for samples
of size 4, 5 and 6 the sign test is approximately 95% as efficient as the Student
t-test. The relative efficiency decreases as the size of the sample increases but
even for samples as large as 13 is approximately 75%.


TABLE I

A comparison of the power functions of the sign and t tests

Test   m    n      Approximate   Significance   Values of Power Function
                   Relative      Level          δ=-½    δ=-1    δ=-1½   δ=-2
                   Efficiency
t           3.8                  .0624          .219    .484    .755
sign   0    4      95%           .0624          .229    .500    .755
t           4.8                  .0312          .150    .402            .909
sign   0    5      96%           .0312          .159    .420            .888
t           5.7                  .0156          .098    .330    .660    .899
sign   0    6      95%           .0156          .110    .355    .655    .863
t           5.6                  .0625                  .695    .932    .995
sign   1    7      80%           .0625          .311    .711    .920    .988
t           6.4                  .0352          .225    .619    .908    .989
sign   1    8      80%           .0352          .239    .630    .869    .978
t           7.4                  .0195          .171    .565    .893    .988
sign   1    9      82%           .0195          .182    .573    .879    .974
t           8                    .0107          .117    .468    .848    .983
sign   1    10     80%           .0107          .137    .515    .853    .964
t           9.75                 .0112          .162            .950    .998
sign   2    13     75%           .0112          .165            .949    .998


For normal populations it is also well known that the most powerful
Studentized test of the alternative $\mu < \mu_0$ is given by the appropriate Student t-test.
It is clear that Table I can also be considered as a comparison of test (5) with
the corresponding Student t-test if $\delta$ is replaced by $-\delta$ and m by n - m.

5. Approximate cases. Suppose that (1) is only approximately satisfied by
the population in question.

Let $\Pr(x < \mu) = \tfrac12 + r$. Then the significance level of (2) is

\[ \sum_{i=0}^{m} \frac{n!}{i!\,(n-i)!} \left(\tfrac12 + r\right)^{i} \left(\tfrac12 - r\right)^{n-i}. \tag{6} \]

Significance levels of (2) for small size samples are given in Table II as a
function of r.
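Expression (6) is a binomial tail with success probability ½ + r, so its sensitivity to r can be explored for any sample size. The following sketch is illustrative only; the function name and the particular (m, n, r) values tried are assumptions, not entries taken from Table II.

```python
from math import comb

def significance_level(m, n, r):
    """Expression (6): level of test (2) when Pr(x < mu) = 1/2 + r."""
    p = 0.5 + r
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(m + 1))

if __name__ == "__main__":
    # Small samples: the level moves only slightly with r.
    for r in (-0.02, 0.0, 0.02):
        print("n=6   r=%+.2f  level=%.4f" % (r, significance_level(0, 6, r)))
    # A larger sample (m chosen arbitrarily) is far more sensitive to the same r.
    for r in (-0.02, 0.0, 0.02):
        print("n=400 r=%+.2f  level=%.4f" % (r, significance_level(180, 400, r)))
```

Replacing r by −r in the call gives the corresponding level for test (3).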


TABLE II 

A comparison of the significance levels of the sign test when the mean differs from 

the median 


m 

n 

Significance Level 

r=0 

— 


r** 02 

r=.05 

0 

4 

.0624 

.073 

, 

mm 


0 

5 

.0312 

.038 


■a 

; 

0 

6 

.0156 

.020 


.012 



Table II shows that for small samples the significance level of (2) does not change
greatly from $\alpha(m, n)$ if (1) is only approximately satisfied. Expression (6)
shows, however, that for large size samples even a small value of r can cause a
large change in the significance level of (2).

For $\Pr(x < \mu) = \tfrac12 + r$ it is apparent that the significance level of (3) is (6)
with r replaced by -r, so that Table II applies to test (3) if this replacement is
made.

AN APPROXIMATION TO THE PROBABILITY INTEGRAL


By J. D. Williams

United States Naval Ordnance Test Station, Inyokern, California

1. Summary. It is shown that

\[ \frac{1}{\sqrt{2\pi}} \int_{-x}^{x} e^{-\frac12 t^2}\, dt \le \left[ 1 - e^{-\frac{2}{\pi} x^2} \right]^{\frac12}, \]

and that the equality is never in error by as much as three-fourths of one percent.
Other approximations are discussed.


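2. For use on those occasions when an approximate analytic expression for
the integral

\[ p(x) = \frac{1}{\sqrt{2\pi}} \int_{-x}^{x} e^{-\frac12 t^2}\, dt \tag{1} \]

is desired, the approximation

\[ p'(x) = \left[ 1 - e^{-\frac{2}{\pi} x^2} \right]^{\frac12} \tag{2} \]

is simple and reasonably accurate. An approximation equivalent to this is
quite commonly used in problems involving a bivariate normal distribution,
but its use in the one-dimensional case seems to be less well known.

We shall first show that $p(x) < p'(x)$ and then estimate, by calculation,
the relative error made when the equality is accepted.

\[ p(x) = \frac{1}{\sqrt{2\pi}} \int_{-x}^{x} e^{-\frac12 t^2}\, dt = \left[ \frac{1}{2\pi} \int_{-x}^{x}\!\! \int_{-x}^{x} e^{-\frac12 (h^2 + k^2)}\, dh\, dk \right]^{\frac12} < \left[ \frac{1}{2\pi} \int_{0}^{2\pi}\!\! \int_{0}^{\frac{2}{\sqrt{\pi}} x} r\, e^{-\frac12 r^2}\, dr\, d\theta \right]^{\frac12} \tag{3} \]

\[ \phantom{p(x)} = \left[ 1 - e^{-\frac{2}{\pi} x^2} \right]^{\frac12} = p'(x), \quad \text{q.e.d.} \]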


The approximation, introduced at the stage of passage to polar coordinates,
comprises replacement of the square region of integration $-x \le x_i \le x$ by a
circular region, $0 \le r \le \frac{2}{\sqrt{\pi}} x$, having the same area. Since we are dealing
with a circular normal distribution with zero means, the region of fixed area
which covers the greatest density is a circle whose center is at the origin.
Therefore our square region of area $4x^2$ must contain less density than the
circular region of area $4x^2$ by which we have replaced it.

The maximum value of the relative error,

\[ \epsilon_p = \frac{p'(x)}{p(x)} - 1, \tag{4} \]

is found by calculation to be about seven-tenths of one percent, as may be judged
from Table 1, column 3.
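The inequality and the size of the relative error are easy to confirm numerically. This sketch assumes p(x) is evaluated as erf(x/√2) and scans a grid of x values; the grid limits and the function names are choices made here, not part of the note.

```python
from math import erf, expm1, pi, sqrt

def p_exact(x):
    """Probability integral (1): normal mass between -x and x."""
    return erf(x / sqrt(2.0))

def p_approx(x, c=2.0 / pi):
    """Approximation [1 - exp(-c x^2)]^(1/2); c = 2/pi gives (2)."""
    return sqrt(-expm1(-c * x * x))

def max_relative_error(c=2.0 / pi, x_max=6.0, steps=6000):
    """Largest |p'(x)/p(x) - 1| on an evenly spaced grid in (0, x_max]."""
    worst = 0.0
    for i in range(1, steps + 1):
        x = x_max * i / steps
        worst = max(worst, abs(p_approx(x, c) / p_exact(x) - 1.0))
    return worst

if __name__ == "__main__":
    print(max_relative_error())          # about 0.007 with c = 2/pi
    print(max_relative_error(c=0.6302))  # about 0.005 with the adjusted c
```

With c = 2/π the approximation stays above the exact value on the whole grid; with c = 0.6302 the error is smaller in magnitude but changes sign, so the one-sided bound is lost.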

The question may be asked: Can the relative error be reduced by suitable
choice of the parameter c in

\[ p'(x) = \left[ 1 - e^{-c x^2} \right]^{\frac12}\,? \tag{5} \]

Calculation indicates that by taking c = 0.6302 the relative error is reduced to
about one-half of one percent; but this gain is offset, for many purposes, by the
loss of the inequality (3).

The density function implied by (2), namely

\[ \rho'(x) = \frac{x}{\pi}\, e^{-\frac{2}{\pi} x^2} \left\{ 1 - e^{-\frac{2}{\pi} x^2} \right\}^{-\frac12}, \tag{6} \]

has the variance

\[ \sigma^2 = \pi (1 - \log 2) = 0.964. \tag{7} \]

If c is determined so that the density function will have unit variance, then
(5) becomes

\[ p'(x) = \left[ 1 - e^{-2(1 - \log 2)\, x^2} \right]^{\frac12}; \tag{8} \]

this approximation to (1) leads to relative errors of almost two percent, which
occur when x is small.
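The variance quoted in (7) and the unit-variance choice of c can be verified by numerical integration. The sketch below relies on the identity E[X²] = ∫₀^∞ 2x·P(|X| > x) dx for a symmetric variable with P(|X| < x) = p′(x); the value c = 2(1 − log 2) used for the unit-variance case follows from that identity and is an assumption about the form of (8), as are the function names.

```python
from math import exp, log, pi, sqrt

def approx_p(x, c):
    """Approximation (5): [1 - exp(-c x^2)]^(1/2)."""
    return sqrt(1.0 - exp(-c * x * x))

def implied_variance(c, x_max=12.0, steps=24000):
    """Variance of the density implied by (5), by the midpoint rule."""
    h = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += 2.0 * x * (1.0 - approx_p(x, c)) * h
    return total

if __name__ == "__main__":
    print(implied_variance(2.0 / pi))               # close to pi * (1 - log 2)
    print(implied_variance(2.0 * (1.0 - log(2.0)))) # close to 1
    # Relative error of the unit-variance form for small x:
    print(sqrt(2.0 * (1.0 - log(2.0)) / (2.0 / pi)) - 1.0)
```

The last line evaluates the limiting ratio p′(x)/p(x) − 1 as x tends to zero, which is about −0.018 and matches the remark that errors of almost two percent occur when x is small.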

The density function (6) may be used to judge the quality of (2) in
approximating to an integral of the form

\[ p(x_1, x_2) = \frac{1}{\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac12 t^2}\, dt, \tag{9} \]

the approximation being

\[ p'(x_1, x_2) = \tfrac12 \left[ p'(x_2) - p'(x_1) \right] \tag{10} \]

when $x_1$ and $x_2$ are positive (which is the severe case). It is evident that the
relative error in accepting (10) for (9) cannot exceed the greatest relative
discrepancy $\epsilon_\rho$, in the interval $x_1 \le x \le x_2$, between density function (6) and the
normal density

\[ \rho(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}. \tag{11} \]

The quantity

\[ \epsilon_\rho = \frac{\rho'(x)}{\rho(x)} - 1 \tag{12} \]

is tabulated in Table 1, column 6, from which it appears that the relative error
committed in using (10) for (9) will surely be less than one-and-a-half percent
provided $0 \le x_i \le 1.8$; but the relative error may be very great when the
interval of integration lies beyond x = 1.8.

TABLE 1

 x      p'(x)    p(x)     ε_p      ρ'(x)    ρ(x)     ε_ρ
 .0     0        0                 .3989    .3989    0
 .1     .0797    .0797             .3969    .3970    .0005
 .2     .1586    .1585             .3914    .3910    .0010
 .3     .2360    .2358    .0008    .3821    .3814    .0018
 .4     .3112    .3108    .0013    .3695    .3683    .0033
 .5     .3836    .3829    .0018    .3539    .3521
 .6     .4526    .4515    .0024    .3356    .3332
 .7     .5177    .5161    .0031    .3151    .3123
 .8     .5785    .5763    .0038    .2929    .2897
 .9     .6347    .6319    .0044    .2695    .2661
1.0     .6862    .6827             .2454             .0141
1.1     .7329    .7287             .2211    .2179    .0147
1.2     .7747    .7699             .1971    .1942    .0149
1.3     .8118    .8064    .0067    .1738    .1714    .0140
1.4     .8443    .8385    .0069    .1516    .1497    .0127
1.5     .8725    .8664    .0070             .1295
1.6     .8967    .8904    .0070    .1113    .1109
1.7     .9171    .9109    .0068
1.8     .9341    .9281    .0065
1.9     .9485    .9426    .0063             .0656
2.0     .9600    .9545    .0058             .0540


The approximations described herein were suggested by the following
situation, encountered in work done by the Applied Mathematics Panel, NDRC:
The probability P of at least one success, defined by $-x \le x_i \le x$, in a sample
of n pairs $(x_1, x_2)$ from a population in which the independent component
probabilities are $p(x)$, is

\[ P = 1 - \left[ 1 - p^2(x) \right]^{n}. \tag{13} \]

A little numerical exploration, supplemented by examination of the limiting
values as $x \to 0$ and $x \to \infty$, revealed that when P is fixed the quantity log n is
very nearly a linear function, of slope minus two, of log x; so nearly, in fact,
that one was encouraged to posit the linearity and observe the consequences.
This yielded (5), which became (2) by requiring that it go to zero with x in the
same manner as (1).
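The near-linearity of log n in log x can be reproduced from (13). The sketch below assumes the exact p(x) = erf(x/√2), solves (13) for the (possibly fractional) n that gives a fixed P, and measures the slope between two values of x; the names and the chosen P are illustrative.

```python
from math import erf, log, sqrt

def p_exact(x):
    """Probability integral: normal mass between -x and x."""
    return erf(x / sqrt(2.0))

def n_required(P, x):
    """Sample size n (treated as continuous) solving P = 1 - [1 - p(x)^2]^n."""
    return log(1.0 - P) / log(1.0 - p_exact(x) ** 2)

def log_log_slope(P, x1, x2):
    """Slope of log n against log x between x1 and x2, for fixed P."""
    return (log(n_required(P, x2)) - log(n_required(P, x1))) / (log(x2) - log(x1))

if __name__ == "__main__":
    for x1, x2 in [(0.1, 0.2), (0.2, 1.0), (1.0, 2.0)]:
        print(x1, x2, round(log_log_slope(0.5, x1, x2), 3))
```

If log n were exactly linear with slope −2, then n·x² would be constant for fixed P, which is what the form 1 − e^(−cx²) for p²(x) expresses.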
DISTRIBUTION OF THE RATIO OF SAMPLE RANGE TO SAMPLE
STANDARD DEVIATION FOR NORMAL AND COMBINATIONS
OF NORMAL DISTRIBUTIONS


By G. A. Baker 

College of Agriculture, University of California at Davis 

1. Introduction. The distribution of sample ranges in terms of the standard
deviation of the sampled population for homogeneous populations has been
dealt with in some detail by mathematical methods for the normal parent and by
empirical sampling methods for non-normal parents. These results are
presented in summary in Tables XXII, XXIII, and XXIV of [1]. Bliss [2] suggests
that the range in different sized samples from a normal parent at various levels
of significance, in terms of the standard deviation computed with varying degrees
of freedom, would be a valuable table. It is not clear whether he means that
the standard deviation is to be estimated from the same sample as the range or
from a second independent sample, as is done by Newman [3], Pearson and
Hartley [4], and Hartley [5].

In natural hybridization of distinct types of plants and subsequent back
crossing with parental types, distinctly bimodal populations may develop. Heiser
[6] has described such a situation for sunflowers. Similar situations may occur
in natural and artificial crossing of peaches and apricots as shown by the work of
Hesse [7] of this station. In studying such genetical material it often would be
helpful to know the expected distributions of the sample ranges in terms of the
sample standard deviations estimated from the same sample for certain typical
nonhomogeneous populations. Applications to such data will be published
elsewhere.

Since the mathematical situation for the distributions of the sample range
(R) in terms of the sample standard deviation (s) appears somewhat complex,
empirical sampling methods were resorted to for obtaining the distributions for a
normal parent (N), a symmetrical distinctly bimodal nonhomogeneous parent
(A), and a weakly bimodal but strongly skewed parent (B). Populations A and
B are pictured in charts A (p. 341) and B (p. 348) of [8].

Population N is approximately represented by

\[ (N) \qquad \frac{1296}{5\sqrt{2\pi}}\, \exp\left( -\tfrac12\, \frac{(X - 15.5)^2}{25} \right), \]

population A by

\[ (A) \qquad \frac{648}{5\sqrt{2\pi}} \left( \exp\left( -\tfrac12\, \frac{(X - 15.5)^2}{25} \right) + \exp\left( -\tfrac12\, \frac{(X - 32.5)^2}{25} \right) \right), \]

and population B by

\[ (B) \qquad \frac{972}{5\sqrt{2\pi}} \left( \exp\left( -\tfrac12\, \frac{(X - 15.5)^2}{25} \right) + \tfrac13 \exp\left( -\tfrac12\, \frac{(X - 31.5)^2}{25} \right) \right). \]

The method of drawing samples is the same as that originally described in [9].
N, A, and B each have a total area of 1296. Thus, 1296 integers distributed
over a proper range and with the frequencies indicated by the corresponding
areas under the curves N, A, and B were entered on charts with 6 big rows and
6 big columns of squares which were subdivided into 6 little rows and 6 little
columns. In each case the 1296 integers were distributed in a non-systematic
way among the 1296 little squares. By throwing 4 differentiated dice (one die
assigned to a big row, one to a big column, one to a little row, and one to a little
column) it was possible to draw random individuals from populations that are
approximately N, A, and B.
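The four-dice scheme amounts to choosing one of 6⁴ = 1296 little squares with equal probability. A minimal sketch of that mapping follows; the 36 × 36 `chart` layout, the function names, and the stand-in chart contents are all hypothetical, since the actual charts are those of [9].

```python
import random

def cell_from_dice(big_row, big_col, little_row, little_col):
    """Map four dice faces (each 1..6) to a (row, col) cell of a 36 x 36 chart."""
    row = 6 * (big_row - 1) + (little_row - 1)
    col = 6 * (big_col - 1) + (little_col - 1)
    return row, col

def draw_individual(chart, rng):
    """Throw four distinguishable dice and read off the integer in that cell."""
    dice = [rng.randint(1, 6) for _ in range(4)]
    row, col = cell_from_dice(*dice)
    return chart[row][col]

def draw_sample(chart, size, rng):
    """Draw `size` individuals independently (with replacement)."""
    return [draw_individual(chart, rng) for _ in range(size)]

if __name__ == "__main__":
    rng = random.Random(1946)
    # Stand-in chart: the integers 0..1295 in order, NOT one of the paper's charts.
    chart = [[36 * r + c for c in range(36)] for r in range(36)]
    print(draw_sample(chart, 4, rng))
```

Because each of the 1296 dice outcomes maps to a different cell, every little square is equally likely, so the sampled integers follow the frequencies entered on the chart.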

Fisher [10] has defined $g_1$ which measures the skewness of a distribution and $g_2$
which measures the flatness. These g's are equivalent to the square root of $\beta_1$
and $\beta_2 - 3$, respectively, in Karl Pearson's older notation. For population A,
$g_1 = 0$ and $g_2 = -1.10$. For population B, $g_1 = 0.62$ and $g_2 = -0.29$.

TABLE 1

Distribution of range in terms of sample standard deviation for samples of specified
sizes from a normal parent population (N), g_1 = 0, g_2 = 0

Sample   Number     Mean      Standard    g_1      Standard      g_2      Standard
Size     of                   Deviation            Error of g_1           Error of g_2
         Samples                                   (Normal)               (Normal)
  2                 1.4142    0.0         0.0
  4                 2.2238    0.1564                             0.434    0.1400
 16                 3.5112    0.3879      0.115                  0.135    0.2783
 36      135        4.4014    0.6076                             0.332    0.4142
 64       76        4.8272    0.6409      0.492    0.2756       -0.751    0.5448
100       48        5.1215    0.6616                             1.038    0.6744


2. Empirical random sampling results. The sample sizes considered are 2, 4,
16, 36, 64, 100. The distribution functions for various sample sizes are
characterized by giving means, standard deviations, $g_1$'s, and $g_2$'s. The results are
given in Tables 1, 2, and 3. The standard deviations of the samples were
computed by dividing the sum of squares by one less than the number in the sample.
When the size of the sample is two then the range divided by the standard
deviation of the sample is always a constant, the square root of 2.
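The statistic studied here, range divided by the sample standard deviation with divisor n − 1, can be written in a few lines. The sketch below is illustrative (the function name and the sample values are not from the note) and confirms the constant value √2 for samples of two.

```python
from math import sqrt

def range_over_sd(sample):
    """Sample range divided by the sample standard deviation (divisor n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)
    s = sqrt(sum_sq / (n - 1))
    return (max(sample) - min(sample)) / s

if __name__ == "__main__":
    print(range_over_sd([3.0, 11.0]))             # sqrt(2) for any two distinct values
    print(range_over_sd([12.0, 15.0, 9.0, 20.0])) # varies from sample to sample
```

For n = 2 the sum of squares is half the squared range, so the ratio is √2 whatever the two values are, provided they differ.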

The constants for the distributions for all sample sizes except four were com- 
puted without grouping. The constants for the distributions for samples of 
four were computed from grouped data with a small class interval. 

3. Discussion. The mean values of the range divided by the standard
deviation of the sample for population A run lower than for populations N and B.
The standard deviations of the distributions for all parents increase from zero
and continue to increase throughout the range considered for population N.
The standard deviations cut down much more quickly for population A than
for population B. The values of $g_1$ and $g_2$ show that the distributions are
significantly non-normal for certain sample sizes but perhaps not seriously so for
other sample sizes.

The distributions of range divided by the sample standard deviation are quite
different from the corresponding distributions of range in terms of the standard
deviations of the population, as can be seen by reference to the tables in [1].

TABLE 2

Distribution of range in terms of sample standard deviation for samples of specified
sizes from a bimodal symmetrical population (A), g_1 = 0, g_2 = -1.10

Sample   Number     Mean      Standard    g_1      Standard      g_2      Standard
Size     of                   Deviation            Error of g_1           Error of g_2
         Samples                                   (Normal)               (Normal)
  2                 1.4142    0.0         0.0
  4      1040       2.2050    0.1551     -0.468    0.0758                 0.1516
 16       259       3.5742    0.5283      1.025    0.1514        1.182    0.3015
 36       115       4.0690    0.4604      0.561    0.2255                 0.4474
 64        64       4.3194    0.3377      0.106    0.2993       -1.829    0.5905
100        41       4.4846    0.3194                                      0.7245


TABLE 3

Distribution of range in terms of sample standard deviation for samples of specified
sizes from a skewed bimodal population (B), g_1 = 0.62, g_2 = -0.29

Sample   Number     Mean      Standard    g_1      Standard      g_2      Standard
Size     of                   Deviation            Error of g_1           Error of g_2
         Samples                                   (Normal)               (Normal)
  2                 1.4142    0.0         0.0
  4      1061       2.2258    0.1459     -0.470    0.0751       -0.142    0.1500
 16       265       3.9277    0.5938      0.540    0.1496        0.405    0.2982
 36       117       4.4792    0.5476      0.400    0.2236        0.018    0.4437
 64        66       4.8485    0.5249      0.534    0.2950        1.028    0.5906
100        42       5.0481    0.3626     -0.092    0.3655       -0.632    0.7166


At the suggestion of the referee it is noted that the empirical results for the
means in Table 1 are rather well approximated by E(R)/E(s). It is necessary
to remember that $E(s) \ne \sigma$ for small samples. For a discussion of E(s) see
Kenney [11], equation 28, page 135.

It is also noted that if

\[ X = \log\,(\log \text{sample size} - \log 2), \]
\[ Y = \log\,(\text{mean} - \sqrt{2}), \]

then the plots of the (X, Y) values in each case are approximately straight lines
for the present range in sample sizes.

The standard deviation and range when determined from the same sample
are correlated. For the normal population this correlation decreases and
practically disappears for samples of 100 or greater. This is not true for populations
A and B. For these populations the correlation between sample range and
sample standard deviation decreases much more slowly and seems to be of the
order of 0.5 for samples of 100.

Readers are invited to submit to the Secretary of the Institute news items of interest.
Personal Items 

Dr, Theodore W. Anderson., Jr. of the Cowles Commission for Economic Re- 
search has been awarded a Guggenheim Memorial Foundation Fellowship. 

Assistant Professor Theodore A. Bancroft of Iowa State College has been 
appointed to an associate professorship at the University of Georgia 

Dr. Z. W. Birnbaum is now an associate professor in the mathematics depart- 
ment at the University of Washington, 

Mr. Albert H. Bowker and Mr. Edward Paulson, formerly with the Statistical 
Research Group, of Columbia University, have been awarded pre-doctoral 
fellowships in mathematical statistics by the National Research Council. They 
are now studying at Columbia University. 

Mr Oscar K. Buros, of Rutgers University, is Review Editor of the Journal of 
the American Statistical Association. He is making the review section a very 
important part of the Journal with such features as replicating reviews and biblio- 
graphies of statistical methodology. Members of the Institute who are authors of 
papers and books (both English and non-English) on statistical methodology are 
urged to send a reprint, review copy, or bibliographic information to Mr. Buros 
as soon after publication as possible. 

Professor Harold Cramer, Director of the Institute of Mathematical Statistics 
at the University of Stockholm will be a visiting professor at Princeton Uni- 
versity during the fall semester of the 1946-1947 academic year. He will give 
a course of graduate lectures on the theory of probability. 

Dr. J. H. Curtiss has been appointed assistant to the Director of the National
Bureau of Standards, where his duties will include the administration of the
mathematical and statistical activities of the Bureau. Dr. Curtiss served in the U. S.
Naval Reserve during the war, and recently received a Commendation Ribbon
from the Secretary of the Navy for his work in statistical engineering for
the Bureau of Ships and the Office of the Commander-in-Chief. He will continue
to be on leave of absence from Cornell University throughout the academic
year 1946-1947. Administrative direction of the Mathematical Tables Project
of the National Bureau of Standards has been assigned to Dr. Curtiss. Members
of the Institute are cordially invited to visit the Project when in New York City,
and to confer with the Project Director, Dr. Arnold Lowan, concerning their
computational problems. The address of the Project is 150 Nassau Street, New
York City. The Project is currently supported by funds transferred to the Bureau
from the Office of Research and Inventions of the Navy Department. An
Advisory Panel of mathematicians interested in the computation of tables is
being formed to define the long range program of the Project. An announcement
as to the personnel of this panel will appear in a later issue of the Annals.

Assistant Professor W. J. Dixon of the University of Oklahoma has been
appointed to an associate professorship at the University of Oregon.

Dr. Hallett H. Germond has returned from war service to his teaching duties
in the Department of Mathematics at the University of Florida.

Dr. Earl L. Green has accepted a position as Associate Professor of Zoology
at Ohio State University.

Mr. John C. Hintermaier, formerly supervisory chemist with the Forstmann
Woolen Company of Passaic, has accepted a position as chief chemist of the
Vanity Fair Mills at Reading.

Mr. William Hodgkinson, Jr., has returned from war service to his position
with the American Telephone and Telegraph Company at New York.

Mr. Robert H. Hoskins, discharged from the Navy in March, is employed in 
the Actuarial Ordinary General Division of the John Hancock Mutual Life 
Insurance Company at Boston. 

A testimonial dinner was given to Professor Harold Hotelling on May 3, 1946 
at the Columbia University Men’s Faculty Club as a farewell by the Statistical 
Techniques Group, New York Chapter, American Statistical Association. 
Professor Hotelling is leaving Columbia at the end of the academic year to be- 
come Professor of Mathematical Statistics at the University of North Carolina. 
Professor Helen M. Walker, on behalf of the Group, presented gifts to Professor 
and Mrs. Hotelling. The Chairman, Professor Irving Lorge, introduced the 
distinguished visitors who came to honor Professor Hotelling. Among the speakers
were Professor P. C. Mahalanobis of Presidency College, Calcutta, India,
Dr. Stuart Rice, Chairman of the Statistical Commission of the Economic and 
Social Council of the United Nations, and Dean Pegram of the Graduate Facul- 
ties of Columbia University. Professor Hotelling reviewed the changes in sta- 
tistical theory and techniques that were developed during the 15 years of his 
professorship at Columbia University. 

Mr. Calvin J. Kirchen, who has recently accepted a position with the technical 
department of Remington Arms Company at Bridgeport, Conn., addressed the 
Rochester Society of Quality Control Engineers on Sept. 17 on “The Applications
of Sequential Analysis to Acceptance Inspection.”

Dr. Walter Leighton of the Rice Institute has been appointed to a professor- 
ship at Washington University. 

Miss Dorothy Marrow has been appointed to an assistant professorship at
George Washington University.

Professor D. E. Morton of the National Bureau of Economic Research is
joining the faculty of Cornell University.

Assistant Professor Cecil J. Nesbitt of the University of Michigan has been 
promoted to an associate professorship. 

Dr. A. C. Olshen has accepted a position as Actuary of the West Coast Life 
Insurance Company at San Francisco. 





NEWS AND NOTICES 


Mr. William B. Rice has opened an office as Consulting Business Statistician
at 1011 South Los Angeles Street, Los Angeles. 

Mr. John Salerno, formerly a draftsman (statistical) with the War Department,
is now Mathematician with the U. S. Coast and Geodetic Survey.

Assistant Professor Henry Scheffé of the Mathematics department of Syracuse
University has been appointed associate professor of engineering at the University
of California at Los Angeles. Professor Scheffé has been awarded a Guggenheim
Memorial Foundation Fellowship.

Mr. William B. Simpson has returned from overseas and is attending the University
of Chicago.

Professor George W. Tyler has returned to his position in the Mathematics
Department at Virginia Polytechnic Institute, having spent two years at the 
University of California Division of War Research. 

Professor W. Allen Wallis, who returned to his position at Stanford University 
in April after serving for nearly four years as Director of Research with the 
Statistical Research Group of Columbia University, has accepted a position as 
Professor of Statistics and Economics in the School of Business of the University 
of Chicago effective September 1, 1946. 

Mr. Frank A. Week, who served during the war as a Captain in the Office of the
Surgeon General, is now in the Actuarial division of the Metropolitan Life Insurance
Company.

The University of Pennsylvania held a conference on “Measurement of Con- 
sumers Interest” at Philadelphia on May 17-18, 1946. This conference was 
sponsored by the Departments of Philosophy, Psychology, Statistics, Marketing, 
and Foreign Commerce. Among the speakers were the following members of 
the Institute: Professor L. L. Thurstone of the University of Chicago, Professor
Louis Guttman of Cornell University, Dr. W. Edwards Deming of the Bureau
of the Budget, Professor C. West Churchman of the University of Pennsylvania, 
Dr. John H. Curtiss of the National Bureau of Standards, Professor Paul Peach 
of the University of North Carolina, and Professor S. S. Wilks of Princeton Uni- 
versity. 

The following four doctorates, with mathematical statistics as a major subject, 
were conferred during 1945 in the United States. The name, University, month 
in which the degree was conferred, and the title of the dissertation are given in
each case: 

T. W. Anderson, Jr., Princeton, June, “The Non-Central Wishart Distribution
and its Application to Problems in Multivariate Statistics.”

Frances Campbell, Michigan, June, “A Study of Truncated Bivariate Normal 
Distributions.” 

W. M. Chen, California, June, “Power Function of the Analysis of Variance and
Covariance of a Normal Bivariate Population.”

J. J. Livers, Michigan, February, “Use of Partitions in Multivariate Moment 
Sampling Theory.” 
Professor A. R. Crathorne of the University of Illinois, a Fellow of the Institute
and one of its founders, died on March 7, 1946 at the age of 72.


Announcement of New Preliminary Actuarial Examinations

On June 7, 1947, three new Preliminary Actuarial Examinations will be given 
to undergraduate students of mathematics and others who may be interested in 
going into the actuarial profession. These new examinations are sponsored 
jointly by the Actuarial Society of America and the American Institute of Ac- 
tuaries. 

The new series of examinations will replace Parts 1, 2, and 3 of the actuarial 
examinations which have been given heretofore, but will carry the same credit 
toward Associateship in the two actuarial organizations. These examinations 
have been prepared under the direction of a joint committee of actuaries and 
mathematicians. They will be administered by the College Entrance Examina- 
tion Board at centers throughout the United States and Canada. 

Descriptions of the three new examinations are as follows: 

1. Language Aptitude Examination. This is a three-hour aptitude examina- 
tion testing reading comprehension and precise knowledge of the meaning of 
words. It is similar to the well-known Scholastic Aptitude Test of the College 
Entrance Examination Board, except that it is pitched at approximately the 
college sophomore level. Verbal facility and command of the English language,
as well as mathematical ability, are important in the actuarial profession. This 
is not the type of an examination for which specific preparation can be made; 
it is an aptitude rather than an achievement examination. 

2. General Mathematics Examination. This is a three-hour achievement 
examination on material usually covered in the first two years of mathematics 
in colleges and universities in the United States and Canada. More speci- 
fically, it is based on college algebra, trigonometry, analytical geometry, and 
differential and integral calculus. It is designed to be taken by the mathe- 
matically talented undergraduate at the end of his sophomore year, although 
it is not restricted to this group. 

3. Special Mathematics Examination. This is a three-hour achievement 
examination based on the material usually covered in undergraduate courses 
in finite differences, probability, and statistics. It is designed to be given at 
the end of the junior or senior year to college mathematics majors who have 
either taken courses or done concentrated reading in these fields, but it is not 
restricted to this group. 

The two actuarial bodies will jointly award one $200 and eight $100 prizes 
to the nine highest-ranking contestants on the basis of performance on the first 
two of the examinations described above. In determining these awards the 
General Mathematics Examination will be weighted twice as much as the 
Language Aptitude Examination.

Information regarding these new examinations, and applications for taking
them, may be obtained from either of the following organizations: 

The Actuarial Society of America 
393 Seventh Avenue 
New York 1, New York 

The American Institute of Actuaries 
720 North Michigan Avenue 
Chicago, Illinois 


Announcement of Cowles Fellowships for Women 

Two Sarah Frances Hutchinson Cowles Fellowships for women will be awarded
by the University of Chicago for the academic year 1947-48 upon nomination by 
the Cowles Commission for Research in Economics. Applicants must be stu- 
dents of outstanding promise, preparing for the degree of master or doctor in the 
field of social sciences and statistics, preferably in quantitative economics or 
mathematical statistics. The Fellowships amount to $1000 each, but may be 
supplemented by an additional grant of $500 if the work of the Fellowship holder 
lies within the Cowles Commission’s field of interest. Holders will be expected 
to be in residence at the University of Chicago. Application and supporting 
documents must be filed before March 1, 1947. Application blanks and further 
particulars may be secured from the Cowles Commission for Research in Eco- 
nomics, The University of Chicago, Chicago 37, Illinois, U. S. A. 


New Members 

The following persons have been elected to membership in the Institute: 

Alger, Philip L., M.S. (Union) Staff Ass’t. to Mgr. of Eng., Gen. Elec. Co., Schenectady,
N. Y., 1758 Wendell Ave., Schenectady 8, N. Y.

Baer, Prof. Reinhold, Ph.D. (Göttingen) Dept. of Math., U. of Ill., Urbana, Ill.

Behrends, Stanley George, LL.B. (La Salle) Ass’t. Purchasing Agent, 4S9-65th St.,
Oakland 9, Calif.

Benford, Frank, B.E.E. (Michigan) Physicist, 1648 Rugby Rd., Schenectady ■'£, N. Y. 

Burke, H. D., Chief of Inspection and Qual. Control, The Coleman Co., Inc., Wichita 1,
Kansas.

Church, Assoc. Prof. Randolph, Ph.D. (Yale) Postgrad. School, U. S. Naval Academy,
Annapolis, Md., 816 N. Glen Ave., Annapolis, Maryland.

Delhi, Douglas George, M.A. (Drake) Statistician, Tuberculosis Control Div., U. S.
Public Health Service, 8896 Porter St., N. W., Wash. 16, D. C.

Dimsdale, Bernard, Ph.D. (Minnesota) Instr., Purdue U., 464 Washington Ave., Glencoe,
Ill.

Eaves, James C., M.A. (Kentucky) Instr., Math. Dept., U. of N. C., Chapel Hill, N. C.

Elveback, Lillian R., B.A. (Minnesota) Instr., Biostatistics Dept., School of Public Health,
Columbia Univ., 600 W. 168th St., N. Y. 32, N. Y.

Harris, Theodore E., B.A. (Texas) Student, Graduate College, Princeton, N. J.
Hirsch, Warren M., B.B.A. (New York) Teacher, NYC High School System, 2791 Univ.
Ave., Bronx, N. Y.

Hughes, Harry M., M.A. (Texas) Cost Accountant, Maritime Commission, 1454 Bancroft
Way, Berkeley 2, Calif.

Jaramillo, Trinidad J., Ph.D. (Chicago) Research Mathematician, 1947 So. Kedzie Ave.,
Chicago 23, Ill.

Jones, Warren E., B.A. (Maryville) Owner and Pres. of Management Controls, 699 Rose
Ave., Des Plaines, Ill.

Kalinowski, Walbert, Graduate Student in Math. and Statistics, 3689 W. Pine Blvd., St.
Louis 8, Mo.

Keeney, Roger D., A.B. (Bucknell) Actuarial Clerk, Metropolitan Life Ins. Co., N. Y.,
N. Y., 110 Fournier Crescent, East Paterson, N. J.

Keislar, Evan R., Ph.D. (California) Instr., Princeton U., also Research Assoc., College
Entrance Exam. Board, Nassau Club, Princeton, N. J.

Keppler, Wharton Fields, B.A. (Ohio State) Math. Statistician, M & R Dietetic Lab.,
Inc., 8 E. Long St., Columbus 16, Ohio.

Kubis, Assoc. Prof. Joseph F., Ph.D. (Fordham) Dept. of Psychology, Fordham U. Grad.
School, N. Y., N. Y.

Leepin, Peter, Ph.D. (Basle) Actuary, Basler Life Ins. Co., Gellerstr. 62, Basle,
Switzerland.

Lefever, Prof. David Welty, Ph.D. (S. California) Dept. of Education, U. of S. Calif.,
University Park, Los Angeles, Calif.

Likert, Rensis, Ph.D. (Columbia) Head of the Div. of Program Surveys, B.A.E., Dept.
of Agriculture, Washington, D. C.

Marks, Eli S., Ph.D. (Columbia) Principal Business Economist, OPA, Wash., D. C., 3711
Horner Place S. E., Washington 20, D. C.

Martin, Prof. William Ted, Ph.D. (Illinois) Dept. of Math., Syracuse U., Syracuse 10,
N. Y.

McGann, Paul Williamson, A.B. (Brown) Acting Section Head, Bldg., Material Equip.
Constr. Price Div., OPA, 2700 Wisconsin Ave., N. W., Washington 7, D. C.

Michael, William Burton, M.S. (S. California) Lecturer in Math., Psychology, Education,
388 So. Oak Ave., Pasadena 8, Calif.

Muench, Prof. Hugo, Dr.P.H. (J.H.U.) Dept. of Biostatistics, Harvard School of Pub.
Health, 55 Shattuck St., Boston 15, Mass.

Murphy, Barbara M., Librarian, Raytheon Mfg. Co., Power Tube Div., Foundry Ave.,
Waltham 54, Mass.

Murray, Janet H., A.M. (Stanford) Asst. Head, Family Economics Div., Bureau Human
Nutrition and Home Economics, U. S. Dept. of Ag., 1025 Connecticut Ave., Washington
6, D. C.

Nemmers, Frederic E., M.S. (Iowa) Instr., U. of Wisconsin, 2936 N. Hackett Ave.,
Milwaukee 11, Wisconsin.

Neurdenburg, M. G., D.P.H. (Amsterdam) Head of the Bureau of Business-Control and
Statistics of the Municipal Health Dept. of Amsterdam and Honorary Secretary of the
General Netherlands Society for Public Health and Social Medicine, Frans van
Mierisstraat 134, Amsterdam Zuid 1, Holland.

Noel, Roland H., M.S. (Massachusetts Col. of Pharmacy) Special Asst. to Production
Mgr., Penicillin Div., Bristol Labs. Inc., Thompson Rd., Syracuse, N. Y.

Nordquist, John M., M.S. (Oklahoma) Research Asst., Seismological Lab., 220 N. San Rafael
Ave., Pasadena 2, Calif., 1695 Corson St., Pasadena 4, Calif.

O’Connor, Howard J., M.A. (Toronto) Technical Asst., Development Div., Union Carbide
and Carbon Research Labs. Inc., 137-47th St., Niagara Falls, N. Y., 1016 Cleveland
Ave., Niagara Falls, N. Y.

Odle, John W., Ph.D. (Michigan) Head, Math. Sec., Research and Development, Naval
Ordnance Test Station, Inyokern, Calif.

Pascua, Asst. Prof. Marcelino, M.D. (Madrid) Dept. of Biostatistics, Johns Hopkins
Univ., 615 N. Wolfe St., Baltimore 5, Md.

Perlstein, Mae, B.A. (Hunter) Teaching Asst., U. of Calif., tJjfll Durant Ave., Berkeley 4,
Calif.

Perrott, Major Ivan Brian, M.A. (Oxford) R. Signals, B.A.O.R., 17 Widney Manor Rd.,
Solihull, Warwickshire, England.

Price, Prof. Griffith Baley, Ph.D. (Harvard) Dept. of Math., 205 Frank Strong Hall, U. of
Kansas, Lawrence, Kansas.

Reid, David Buchanan William, B.A. (McGill U., Montreal) Graduate Student, Statistics,
V.P.I., P. O. Box 431, Blacksburg, Virginia.

Reynolds, John Hughes III, M.A. (U. of the South) Technical Control Statistician,
Celanese Corp. of America, Tubize Div., Rome, Georgia.

Reynolds, William A., M.A. (California) Research Associate, National Broadcasting Co.,
30 Rockefeller Plaza, New York 20, N. Y.

Salerno, John, B.A. (Brooklyn) Draftsman (Statistical), 530 Lincoln Ave., Brooklyn 8,
N. Y.

Shaw, Byron T., Ph.D. (Ohio State) Principal Agronomist, Plant Industry Station,
Beltsville, Maryland.

Shephard, Asst. Prof. Ronald W., Ph.D. (California) Dept. of Math., Purdue U.,
Lafayette, Ind.

Simms, Clifford Raymond, M.S. (Michigan) Consulting Actuary, 1038 Connecticut Ave.,
N. W., Washington, D. C.

Sprengel, Herbert J., M.S. (Illinois) Quality Control Engineer, SOS N. Lombard Ave.,

Stone, John Richard Nicholas, M.A. (Cambridge) Director of the Dept. of Applied Eco-

Tweedy, Marjorie A. L., B.S. (Ohio State) Economist, Office of Price Adm., 1417 N, St

Summary. Several statistical techniques are proposed for economically analyzing
large masses of data by means of punched-card equipment; most of these
techniques require only a counting sorter. The methods proposed are de- 
signed especially for situations where data are inexpensive compared to the 
cost of analysis by means of statistically “efficient” or “most powerful” pro- 
cedures. The principal technique is the use of functions of order statistics, 
which we call systematic statistics. 

It is demonstrated that certain order statistics are asymptotically jointly 
distributed according to the normal multivariate law. 

For large samples drawn from normally distributed variables we describe 
and give the efficiencies of rapid methods: 

i) for estimating the mean by using 1, 2, ..., 10 suitably chosen order
statistics; (cf. p. 386)

ii) for estimating the standard deviation by using 2, 4, or 8 suitably chosen
order statistics; (cf. p. 389)

iii) for estimating the correlation coefficient whether other parameters of the 
normal bivariate distribution are known or not (three sorting and three 
counting operations are involved) (cf. p. 394). 

The efficiencies of procedures ii) and iii) are compared with the efficiencies of 
other estimates which do not involve sums of squares or products.

1. Introduction. The purpose of this paper is to contribute some results 
concerning the use of order statistics in the statistical analysis of large masses 
of data. The present results deal particularly with estimation when normally 
distributed variables are present. Solutions to all problems considered have 
been especially designed for use with punched-card equipment although for
most of the results a counting sorter is adequate. 

Until recently mathematical statisticians have spent a great deal of effort
developing “efficient statistics” and “most powerful tests.” This concentration
of effort has often led to neglect of questions of economy. Indeed some may
have confused the meaning of the technical statistical terms “efficient” and
“efficiency” with the layman’s concept of their meaning. No matter how much
energetic activity is put into analysis and computation, it seems reasonable to 
inquire whether the output of information is comparable in value to the input 
measured in dollars, man-hours, or otherwise. Alternatively we may inquire 
whether comparable results could have been obtained by smaller expenditures.
In some fields where statistics is widely used, the collection of large masses of 
data is inexpensive compared to the cost of analysis. Often the value of the 
statistical information gleaned from the sample decreases rapidly as the time 
between collection of data and action on their interpretation increases. Under 
these conditions, it is important to have quick, inexpensive methods for analyzing 
data, because economy demands militate against the use of lengthy, costly 
(even if more precise) statistical methods. A good example of a practical 
alternative is given by the control chart method in the field of industrial quality 
control. The sample range rather than the sample standard deviation is used 
almost invariably in spite of its larger variance. One reason is that, after brief 
training, persons with slight arithmetical knowledge can compute the range 
quickly and accurately, while the more complicated formula for the sample 
standard deviation would create a permanent stumbling block. Largely as a 
result of simplifying and routinizing statistical methods, industry now handles
large masses of data on production adequately and profitably. Although the 
sample standard deviation can give a statistically more efficient estimate of the 
population standard deviation, if collection of data is inexpensive compared to 
cost of analysis and users can compute a dozen ranges to one standard deviation, 
it is easy to see that economy lies with the less efficient statistic. 
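
The comparison between the range and the standard deviation can be checked numerically. The following is a minimal sketch rather than anything given in the paper: it assumes normally distributed data, subgroups of size 5, and the standard control-chart constants d2 = 2.326 and c4 = 0.9400 for that subgroup size. The function name and the simulation settings are illustrative only.

```python
import random
import statistics

D2_N5 = 2.326   # expected range of 5 standard normal observations (assumed constant)
C4_N5 = 0.9400  # expected sample standard deviation / sigma for n = 5 (assumed constant)


def compare_range_and_sd(trials=20000, n=5, seed=1):
    """Simulate two estimates of sigma = 1 from normal subgroups of size n.

    Returns (mean of range estimate, mean of s estimate, relative efficiency),
    where relative efficiency = variance of s estimate / variance of range estimate.
    """
    rng = random.Random(seed)
    range_estimates = []
    sd_estimates = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        range_estimates.append((max(xs) - min(xs)) / D2_N5)
        sd_estimates.append(statistics.stdev(xs) / C4_N5)
    efficiency = statistics.pvariance(sd_estimates) / statistics.pvariance(range_estimates)
    return statistics.fmean(range_estimates), statistics.fmean(sd_estimates), efficiency


mean_range_est, mean_sd_est, efficiency = compare_range_and_sd()
print(round(mean_range_est, 3), round(mean_sd_est, 3), round(efficiency, 3))
```

Under these assumptions both estimates center near the true value of 1, and the relative efficiency of the range typically comes out a little below 1. The range is therefore only slightly less precise per observation, which is the basis of the economic argument in the text.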

It should not be thought that inefficient statistics are being recommended for
all situations. There are many cases where observations are very expensive,
and obtaining a few more would entail great delay. Examples of this situation
arise in agricultural experiments, where it often takes a season to get a set of
observations, and where each observation is very expensive. In such cases the
experimenters want to squeeze every drop of information out of their data.
In these situations inefficient statistics would be uneconomical, and are not
recommended.

A situation that often arises is that data are acquired in the natural course of
administration of an organization. These data are filed away until the accumulation
becomes mountainous. From time to time questions arise which can be
answered by reference to the accumulated information. How much of these data
will be used in the construction of, say, estimates of parameters depends on the
precision desired for the answer. It will however often be less expensive to
get the desired precision by increasing the sample size by dipping deeper into
the stock of data in the files, and using crude techniques of analysis, than to 
attain the required precision by restricting the sample size to the minimum 
necessary for use with “efficient” statistics. 

It will often happen in other fields such as educational testing that it is less
expensive to gather enough data to make the analysis by crude methods sufficiently
precise, than to use the minimum sample sizes required by more refined
methods. In some cases, as a result of the type of operation being carried out,
sample sizes are more than adequate for the purposes of estimation and testing
significance. The experimenters have little interest in milking the last drop of
information out of their data. Under these circumstances statistical workers
would be glad to forsake the usual methods of analysis for rapid, inexpensive
techniques that would offer adequate information, but for many problems such
techniques are not available.

In the present paper several such techniques will be developed. For the 
most part we shall consider statistical methods which are applicable to estimating 
parameters. In a later paper we intend to consider some useful “inefficient” 
tests of significance. 


2. Order statistics. If a sample $O_n = x'_1, x'_2, \ldots, x'_n$ of size n is drawn from
a continuous probability density function f(x), we may rearrange and renumber
the observations within the sample so that

(1)    $x_1 < x_2 < \cdots < x_n$

(the occurrence of equalities is not considered because continuity implies zero
probability for such events). The $x_i$'s are sometimes called order statistics.
On occasion we write $x(i)$ rather than $x_i$. Throughout this paper the use of
primes on subscripted x’s indicates that the observations are taken without
regard to order, while unprimed subscripted x’s indicate that the observations
are order statistics satisfying (1). Similarly $x(n_i)$ will represent the $n_i$th order
statistic, while $x'(n_i)$ would represent the $n_i$th observation, if the observations
were numbered in some random order. The notation here is essentially the
opposite of usual usage, in which attention is called to the order statistics by
the device of primes or the introduction of a new letter. The present reversal
of usage seems justified by the viewpoint of the article: that in the problems
under consideration the use of order statistics is the natural procedure.
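
The notation can be made concrete with a small sketch. The helper names and the five sample values below are hypothetical, chosen only to illustrate the distinction between the unordered observations and the order statistics satisfying (1).

```python
def order_statistics(observations):
    """Return the sample rearranged in increasing order, as in relation (1)."""
    return sorted(observations)


def x(ordered, i):
    """Return the i-th order statistic x(i), using the paper's 1-based index."""
    if not 1 <= i <= len(ordered):
        raise IndexError("order statistic index out of range")
    return ordered[i - 1]


# Hypothetical unordered observations (the primed x's of the text).
unordered = [2.7, -0.3, 1.1, 4.0, 0.6]

# The unprimed x's: the same values after ordering.
ordered = order_statistics(unordered)

# Two statistics built from the ordering: the range and, for odd n, the median.
sample_range = x(ordered, len(ordered)) - x(ordered, 1)
sample_median = x(ordered, (len(ordered) + 1) // 2)
```

Sorting is the only operation needed, which is consistent with the paper's emphasis on methods that a counting sorter can carry out.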

An example of a useful order statistic is the median; when n = 2m + 1 (m =
0, 1, ...), $x_{m+1}$ is called the median and may be used to estimate the population
median, i.e. u defined by

$$\int_{-\infty}^{u} f(t)\,dt = \tfrac{1}{2}.$$

In the case of symmetric distributions, the population mean coincides with u
and $x_{m+1}$ will be an unbiased estimate of it as well. When n = 2m (m = 1, 2,
...), the median is often defined as $\tfrac{1}{2}(x_m + x_{m+1})$. The median so defined is
an unbiased estimate of the population median in the case of symmetric
distributions; however for most asymmetric distributions $\tfrac{1}{2}(x_m + x_{m+1})$ will only
be unbiased asymptotically, that is in the limit as n increases without bound.
For another definition of the sample median see Jackson [8, 1921]. When x is
distributed according to the normal distribution

$$N(x, a, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-a)^2/(2\sigma^2)},$$

the variance of the median is well known to tend to $\pi\sigma^2/2n$ as n increases.
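
The limiting variance of the median, $\pi\sigma^2/2n$ for normal samples, can be checked by simulation. The sketch below is illustrative only: the sample size, number of trials, and seed are arbitrary assumptions, and the function name is not from the paper.

```python
import math
import random
import statistics


def median_variance_ratio(n=101, sigma=1.0, trials=4000, seed=7):
    """Ratio of the simulated variance of the sample median to pi*sigma^2/(2n).

    n is assumed odd, n = 2m + 1, so the median is the order statistic x_{m+1}.
    """
    rng = random.Random(seed)
    medians = []
    for _ in range(trials):
        xs = sorted(rng.gauss(0.0, sigma) for _ in range(n))
        medians.append(xs[n // 2])  # 0-based index m is the (m+1)-th order statistic
    simulated = statistics.pvariance(medians)
    limiting = math.pi * sigma ** 2 / (2 * n)
    return simulated / limiting


ratio = median_variance_ratio()
print(round(ratio, 3))
```

A ratio near 1 indicates agreement with the limiting formula. Since the variance of the mean is $\sigma^2/n$, the same formula implies that the median needs roughly $\pi/2 \approx 1.57$ times as many observations as the mean to reach equal precision in large normal samples.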

It is doubtful whether we can accurately credit anyone with the introduction
of the median. However for some of the results in the theory of order statistics
it is easier to give credit. In this section we will restrict the discussion to the
order statistics themselves, as opposed to the general class of statistics, such as
the range $(x_n - x_1)$, which are derived from order statistics. We shall call
the general class of statistics which are derived from order statistics, and use
the value ordering (1) in their construction, systematic statistics.

The large sample distribution of extreme values (examples $x_r$, $x_{n-s+1}$ for r, s
fixed and $n \to \infty$) has been considered by Tippett [17, 1925] in connection with
the range of samples drawn from normal populations; by Fisher and Tippett
[3, 1928] in an attempt to close the gap between the limiting form of the
distribution and results tabled by Tippett [17]; by Gumbel [5, 1934] (and in many
other papers; a large bibliography is available in [6, Gumbel 1939]), who dealt
with the more general case $r \geq 1$, while the others mentioned considered the
special case of r = 1; and by Smirnoff, who considers the general case of $x_r$
in [15, 1935] and also [16] the limiting form of the joint distribution of $x_r$, $x_s$
for r and s fixed as $n \to \infty$.

In the present paper we shall not usually be concerned with the distribution
of extreme values, but shall rather be considering the limiting form of the joint
distribution of $x(n_1), x(n_2), \ldots, x(n_k)$, satisfying

Condition 1.  $\lim_{n \to \infty} n_i/n = \lambda_i$;  i = 1, 2, ..., k;
              $\lambda_1 < \lambda_2 < \cdots < \lambda_k$.

In other words the proportion of observations less than or equal to $x(n_i)$ tends
to a fixed proportion which is bounded away from 0 and 1 as n increases. K.
Pearson [13, 1920] supplies the information necessary to obtain the limiting
distribution of $x(n_1)$, and limiting joint distribution of $x(n_1)$, $x(n_2)$. Smirnoff
gives more rigorous derivations of the limiting form of the marginal distribution
of the $x(n_i)$ [15, 1935] and the limiting form of the joint distribution of $x(n_i)$
and $x(n_j)$ [16] under rather general conditions. Kendall [10, 1943, pp. 211-14]
gives a demonstration leading to the limiting form of the joint distribution.
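
Condition 1 can be illustrated with a short sketch. It assumes the ranks are chosen as $n_i$ = round($\lambda_i n$), which is one convenient choice satisfying $n_i/n \to \lambda_i$ and is not prescribed by the paper. The function name, the proportions, the sample size, and the seed are all illustrative.

```python
import random
from statistics import NormalDist


def quantile_order_statistics(sample, lambdas):
    """Pick the order statistics x(n_i) with n_i = round(lambda_i * n)."""
    ordered = sorted(sample)
    n = len(ordered)
    picked = []
    for lam in lambdas:
        rank = min(n, max(1, round(lam * n)))  # 1-based rank n_i, kept inside 1..n
        picked.append(ordered[rank - 1])
    return picked


rng = random.Random(11)
lambdas = (0.25, 0.50, 0.75)  # fixed proportions, bounded away from 0 and 1
sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]

observed = quantile_order_statistics(sample, lambdas)
population = [NormalDist().inv_cdf(lam) for lam in lambdas]

for lam, obs, pop in zip(lambdas, observed, population):
    print(lam, round(obs, 3), round(pop, 3))
```

For a large sample each selected order statistic lies close to the population value below which the proportion $\lambda_i$ of the distribution falls. These are the order statistics whose joint limiting distribution the paper studies.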

Since we will be concerned with statements about the asymptotic properties
of the distributions of certain statistics, it may be useful to include a short
discussion of their implications both practical and theoretical. If we have a
statistic $\hat\theta(O_n)$ based on a sample $O_n: x'_1, x'_2, \ldots, x'_n$ drawn from a population
with cumulative distribution function F(x) it often happens that the function
$(\hat\theta - \theta)/\sigma_n = y_n$, where $\sigma_n$ is a function of n, is such that

(A)    $\lim_{n \to \infty} P(y_n < t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-x^2/2}\,dx.$

When this condition (A) is satisfied we often say: $\hat\theta$ is asymptotically normally
distributed with mean $\theta$ and variance $\sigma_n^2$. We will not be in error if we use the
statement in italics provided we interpret it as synonymous with (A). However
there are some pitfalls which must be avoided. In the first place condition
(A) may be true even if the distribution function of $y_n$, or of $\hat\theta$, has no moments
even of fractional orders for any n. Consequently we do not imply by the
italicized statement that $\lim_{n \to \infty} E[\hat\theta(O_n)] = \theta$, nor that
$\lim_{n \to \infty} \{E(\hat\theta^2) - [E(\hat\theta)]^2\} = \sigma_n^2$, for, as mentioned, these
expressions need not exist for (A) to be true. Indeed
we shall demonstrate that Condition (A) is satisfied for certain statistics
even if their distribution functions are as momentless as the startling
distributions constructed by Brown and Tukey [1, 1946]. Of course it may be the case
that all moments of the distribution of $\hat\theta$ exist and converge as $n \to \infty$ to the
moments of a normal distribution with mean $\theta$ and variance $\sigma_n^2$. Since this
implies (A), but not conversely, this is a stronger convergence condition than
(A). (See for example J. H. Curtiss [2, 1942].) However the important
implication of (A) is that for sufficiently large n each percentage point of the
distribution of $\hat\theta$ will be as close as we please to the value which we would compute
from a normal distribution with mean $\theta$ and variance $\sigma_n^2$, independent of whether
the distribution of $\hat\theta$ has these moments or not.
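
A simulation can illustrate the practical content of (A) for a population without moments. The sketch below is an illustration under stated assumptions, not the paper's demonstration: it takes the median of samples from the standard Cauchy distribution, which has neither mean nor variance, and uses the standard large-sample result that this median is approximately normal about 0 with $\sigma_n = \pi/(2\sqrt{n})$. The function name and simulation settings are arbitrary.

```python
import math
import random


def cauchy_median_coverage(n=201, trials=4000, z=1.96, seed=3):
    """Fraction of Cauchy sample medians falling within z * sigma_n of zero.

    sigma_n = pi / (2 * sqrt(n)) is the assumed large-sample standard deviation
    of the median of n standard Cauchy observations (n odd).
    """
    rng = random.Random(seed)
    sigma_n = math.pi / (2.0 * math.sqrt(n))
    hits = 0
    for _ in range(trials):
        # Standard Cauchy variates via the inverse distribution function.
        xs = sorted(math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n))
        if abs(xs[n // 2]) < z * sigma_n:
            hits += 1
    return hits / trials


coverage = cauchy_median_coverage()
print(round(coverage, 3))
```

If the normal approximation is adequate, the printed fraction should be close to 0.95, the probability a normal variable lies within 1.96 standard deviations of its mean. The check is on percentage points only, so it does not depend on the population having moments.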

Similarly if we have several statistics $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k$, each depending upon
the sample $O_n: x'_1, x'_2, \ldots, x'_n$, we shall say that the $\hat\theta_i$ are asymptotically jointly
normally distributed with means $\theta_i$, variances $\sigma_i^2(n)$, and covariances $\rho_{ij}\sigma_i\sigma_j$, when

(B)    $\lim_{n \to \infty} P(y_1 < t_1, y_2 < t_2, \ldots, y_k < t_k)
        = K \int_{-\infty}^{t_1} \int_{-\infty}^{t_2} \cdots \int_{-\infty}^{t_k} e^{-Q^2/2}\, dx_1\, dx_2 \cdots dx_k,$

where $y_i = (\hat\theta_i - \theta_i)/\sigma_i$, and $Q^2$ is the quadratic form associated with a set of
k jointly normally distributed variables with variances unity and covariances
$\rho_{ij}$, and K is a normalizing constant. Once again the statistics $\hat\theta_i$ may not
have moments or product moments; the point that interests us is that the
probability that the point with coordinates $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k)$ falls in a certain
region in a k-dimensional space can be given as accurately as we please for
sufficiently large samples by the right side of (B).

Since the practicing statistician is very often really interested in the prob- 
ability that a point will fall in a particular region, rather than in the variance 
or standard deviation of the distribution itself, the concepts of asymptotic
normality given in (A) and (B) will usually not have unfortunate consequences.
For example, the practicing statistician will usually be grateful that the sample 
size can be made sufficiently large that the probability of a statistic falling into 
a certain small interval can be made as near unity as he pleases, and will not 
usually be concerned with the fact that, say, the variance of the statistic may 
be unbounded. 

Of course, a very real question may arise: how large must $n$ be so that the
probability of a statistic falling within a particular interval can be sufficiently
closely approximated by the asymptotic formulas? If in any particular case
the sample size must be ridiculously large, asymptotic theory loses much of its
practical value. However, for statistics of the type we shall usually discuss,
computation has indicated that in many cases the asymptotic theory holds
very well for quite small samples.

For the demonstration of the joint asymptotic normality of several order 
statistics we shall use the following two lemmas. 

Lemma 1. If a random variable $\hat\theta(O_n)$ is asymptotically normally distributed
converging stochastically to $\theta$, and has asymptotic variance $\sigma^2(n) \to 0$ as $n \to \infty$, where $n$
is the size of the sample $O_n: x'_1, x'_2, \cdots, x'_n$, drawn from the probability density
function $h(x)$, and $g(\hat\theta)$ is a single-valued function with a nonvanishing continuous
derivative $g'(\hat\theta)$ in the neighborhood of $\hat\theta = \theta$, then $g(\hat\theta)$ is asymptotically normally
distributed converging stochastically to $g(\theta)$ with asymptotic variance $\sigma_n^2[g'(\theta)]^2$.

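Proof. By the conditions of the lemma

$$\lim_{n\to\infty} P\left(\frac{\hat\theta - \theta}{\sigma_n} < t\right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-\frac{1}{2}u^2}\, du.$$

Now if $t\sigma_n = \Delta\theta$, $\Delta\theta = \hat\theta - \theta$, using the mean value theorem there is a $\theta_1$ in
the interval $[\theta, \hat\theta]$, such that

$$g(\hat\theta) = g(\theta) + (\hat\theta - \theta)\, g'(\theta_1),$$

which implies

$$\lim_{n\to\infty} P\left(\frac{\hat\theta - \theta}{\sigma_n} < t\right) = \lim_{n\to\infty} P\left(\frac{g(\hat\theta) - g(\theta)}{\sigma_n\, g'(\theta_1)} < t\right), \qquad g'(\theta_1) \neq 0,$$

where $\theta_1$ is a function of $n$. However $\lim g'(\theta_1) = g'(\theta)$ so we may write

$$\lim_{n\to\infty} P\left(\frac{\hat\theta - \theta}{\sigma_n} < t\right) = \lim_{n\to\infty} P\left(\frac{g(\hat\theta) - g(\theta)}{\sigma_n\, g'(\theta)} < t\right), \qquad g'(\theta) \neq 0,$$

where the form of the expression on the right is the one required to complete
the proof of the lemma.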

Of course if we have several random variables $\hat\theta_1, \hat\theta_2, \cdots, \hat\theta_k$, we can prove
by an almost identical argument that

Lemma 2. If the random variables $\hat\theta_i(O_n)$ are asymptotically jointly normally
distributed converging stochastically to $\theta_i$, and have asymptotic variances $\sigma_i^2(n) \to 0$
as $n \to \infty$, and covariances $\rho_{ij}\sigma_i\sigma_j$, where $n$ is the size of the sample $O_n: x'_1, x'_2, \cdots, x'_n$ drawn
from the probability density function $h(x)$, and $g_i(\hat\theta_i)$, $i = 1, 2, \cdots, k$, are single-valued
functions with nonvanishing continuous derivatives $g_i'(\hat\theta_i)$ in the neighborhood
of $\hat\theta_i = \theta_i$, then the $g_i(\hat\theta_i)$ are jointly asymptotically normally distributed with
means $g_i(\theta_i)$, variances $\sigma_i^2[g_i'(\theta_i)]^2$ and covariances $\rho_{ij}\sigma_i\sigma_j\, g_i'(\theta_i)\, g_j'(\theta_j)$.

The following condition represents restrictions on the probability density
function $f(x)$ sufficient for the derivation of the limiting form of the joint
distribution of the $x(n_i)$ satisfying Condition 1.

Condition 2. The probability density function $f(x)$ is continuous, and does not
vanish in the neighborhood of $u_i$, where

$$\int_{-\infty}^{u_i} f(x)\, dx = \lambda_i, \qquad i = 1, 2, \cdots, k.$$


If we recall the discussion of condition (B) above, the theorem of Pearson
and Smirnoff may be stated:

Theorem 1. If a sample $O_n: x'_1, x'_2, \cdots, x'_n$ is drawn from $f(x)$ satisfying
Condition 2, and if $x(n_1)$, $x(n_2)$ satisfy Condition 1 as $n \to \infty$, then $x(n_1)$, $x(n_2)$
are asymptotically distributed according to the normal bivariate distribution with
means $u_1$, $u_2$,

$$\int_{-\infty}^{u_i} f(x)\, dx = \lambda_i,$$

and variances

$$\sigma_i^2 = \frac{\lambda_i(1 - \lambda_i)}{n[f(u_i)]^2}, \qquad i = 1, 2,$$

and covariance

$$\rho_{12}\sigma_1\sigma_2 = \frac{\lambda_1(1 - \lambda_2)}{n f(u_1) f(u_2)}.$$


Theorem 1 has an obvious generalization which seems not to have been carried
out in the literature. The generalization may be stated:

Theorem 2. If a sample $O_n: x'_1, x'_2, \cdots, x'_n$ is drawn from $f(x)$ satisfying
Condition 2, and if $x(n_1), x(n_2), \cdots, x(n_k)$ satisfy Condition 1 as $n \to \infty$, then
the $x(n_i)$, $i = 1, 2, \cdots, k$, are asymptotically distributed according to the normal
multivariate distribution, with means $u_i$,

$$\int_{-\infty}^{u_i} f(x)\, dx = \lambda_i,$$

and variances

$$\sigma_i^2 = \frac{\lambda_i(1 - \lambda_i)}{n[f(u_i)]^2}, \qquad i = 1, 2, \cdots, k,$$

and covariances

$$\rho_{ij}\sigma_i\sigma_j = \frac{\lambda_i(1 - \lambda_j)}{n f(u_i) f(u_j)}, \qquad 1 \le i < j \le k.$$


Proof. We shall carry out the demonstration for the uniform distribution

$$f(x) = \begin{cases} 1, & 0 < x < 1, \\ 0, & \text{elsewhere}, \end{cases}$$

and then utilize the fact that by a suitable transformation of the uniform
distribution we may get any $f(x)$ satisfying Condition 2. Of course for the
particular case of the uniform distribution all moments of the $x(n_i)$ exist and
converge to those of the asymptotic theory.

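The joint probability density of the $x(n_i)$, satisfying Condition 1 and drawn
from $f(x)$, is given by

$$p[x(n_1), x(n_2), \cdots, x(n_k)] = \frac{n!}{(n_1 - 1)!\,(n - n_k)!\,\prod_{i=2}^{k}(n_i - n_{i-1} - 1)!} \left[\int_0^{x(n_1)} dt\right]^{n_1 - 1} \prod_{i=2}^{k}\left[\int_{x(n_{i-1})}^{x(n_i)} dt\right]^{n_i - n_{i-1} - 1} \left[\int_{x(n_k)}^{1} dt\right]^{n - n_k}. \tag{2}$$

Performing the indicated integrations we get from the right of (2)

$$C\,[x(n_1)]^{n_1 - 1} \prod_{i=2}^{k}[x(n_i) - x(n_{i-1})]^{n_i - n_{i-1} - 1}\,[1 - x(n_k)]^{n - n_k}, \tag{3}$$

where $C$ is the multinomial coefficient on the right of (2). It is well known
that for the uniform distribution $E[x(n_i)] = n_i/(n+1)$, or asymptotically $n_i/n$,
$i = 1, 2, \cdots, k$. We make the transformation $y_i = (x(n_i) - n_i/n)\sqrt{n}$, leading to

$$C\left(\frac{n_1}{n} + \frac{y_1}{\sqrt{n}}\right)^{n_1 - 1} \prod_{i=2}^{k}\left(\frac{n_i - n_{i-1}}{n} + \frac{y_i - y_{i-1}}{\sqrt{n}}\right)^{n_i - n_{i-1} - 1} \left(\frac{n - n_k}{n} - \frac{y_k}{\sqrt{n}}\right)^{n - n_k}. \tag{4}$$

Using the usual technique of factoring out expressions like $(n_1/n)^{n_1 - 1}$,
we rewrite (4) with $C_1$ as a new constant, and setting $\lambda_i = n_i/n$,

$$C_1\left(1 + \frac{y_1}{\lambda_1\sqrt{n}}\right)^{n_1 - 1} \prod_{i=2}^{k}\left(1 + \frac{y_i - y_{i-1}}{(\lambda_i - \lambda_{i-1})\sqrt{n}}\right)^{n_i - n_{i-1} - 1} \left(1 - \frac{y_k}{(1 - \lambda_k)\sqrt{n}}\right)^{n - n_k}. \tag{5}$$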
Now taking the logarithm of (5), expanding, neglecting terms $O(1/\sqrt{n})$ and higher,
collecting terms and taking the antilogarithm we get the approximate
asymptotic distribution of the order statistics

$$g(x(n_1), x(n_2), \cdots, x(n_k)) = C_2 \exp\left\{-\frac{1}{2}\left[\sum_{i=1}^{k} \frac{\lambda_{i+1} - \lambda_{i-1}}{(\lambda_{i+1} - \lambda_i)(\lambda_i - \lambda_{i-1})}\, y_i^2 - 2\sum_{i=2}^{k} \frac{y_i\, y_{i-1}}{\lambda_i - \lambda_{i-1}}\right]\right\}, \tag{6}$$

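where $\lambda_0 = 0$, $\lambda_{k+1} = 1$. Now setting up the matrix of the coefficients of the
quadratic expression in the exponent

$$A_{ii} = \frac{\lambda_{i+1} - \lambda_{i-1}}{(\lambda_{i+1} - \lambda_i)(\lambda_i - \lambda_{i-1})}, \qquad i = 1, 2, \cdots, k;$$

$$A_{i,i-1} = A_{i-1,i} = -\frac{1}{\lambda_i - \lambda_{i-1}}; \qquad A_{ij} = 0, \quad |i - j| > 1.$$

To obtain the variances and covariances we need

$$\sigma_{ij} = \frac{\text{cofactor of } A_{ij} \text{ in } \|A_{ij}\|}{\text{determinant } A_{ij}}$$

(see for example Wilks [18, p. 63 et seq.]). Now

$$|A| = \text{determinant } A_{ij} = \prod_{i=1}^{k+1} \frac{1}{\lambda_i - \lambda_{i-1}},$$

$$\text{cofactor of } A_{ii} = \lambda_i(1 - \lambda_i)\,|A|, \qquad i = 1, 2, \cdots, k, \tag{7}$$

$$\text{cofactor of } A_{ij} = \begin{cases} \lambda_i(1 - \lambda_j)\,|A|, & i < j, \\ \lambda_j(1 - \lambda_i)\,|A|, & j < i. \end{cases}$$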

This completes the proof for the uniform distribution. 

If the uniform distribution is transformed into a probability density function
$f(x)$ satisfying Condition 2, by an order preserving transformation, we appeal
to Lemma 2. We notice that the $x(n_i)$ are transformed into $g[x(n_i)]$, and that
the probability that $x(n_i)$ falls in the interval $[u_i, u_i + \Delta u_i]$ is transformed into
the probability that $g[x(n_i)]$ falls in the interval $[g(u_i), g(u_i + \Delta u_i)]$. Using
the mean value theorem we may write

$$g(u_i + \Delta u_i) = g(u_i) + \Delta u_i\, g'(u_i'),$$

where $u_i'$ lies in the interval $[u_i, u_i + \Delta u_i]$. However

$$\lim_{\Delta u_i \to 0} g'(u_i') = g'(u_i).$$

The density for the uniform distribution in the interval $[u_i, u_i + \Delta u_i]$ is just
$\Delta u_i$, and this same density will tend to $f(u_i)\,\Delta u_i\, g'(u_i)$. Therefore $g'(u_i) =
1/f(u_i)$, which completes the proof of Theorem 2.
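
As a rough numerical illustration of Theorem 2 in the uniform case used in the proof, the asymptotic variances and covariances can be compared with the exact moments of uniform order statistics. The sketch below is not part of the original argument; the function names `asymptotic_cov` and `exact_uniform_cov`, and the choice of n = 1000 with ranks 250 and 750, are assumptions made only for this example. The exact formula used is the standard one for order statistics of n uniform(0, 1) observations.

```python
def asymptotic_cov(lams, dens, n):
    """Covariance matrix of Theorem 2.

    Entry (i, j) is min(lam_i, lam_j) * (1 - max(lam_i, lam_j))
    divided by n * f(u_i) * f(u_j); dens holds the values f(u_i).
    """
    k = len(lams)
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            lo = min(lams[i], lams[j])
            hi = max(lams[i], lams[j])
            cov[i][j] = lo * (1.0 - hi) / (n * dens[i] * dens[j])
    return cov


def exact_uniform_cov(r, s, n):
    """Exact covariance of the r-th and s-th order statistics (r <= s)
    in a sample of n uniform(0, 1) observations."""
    return r * (n + 1 - s) / ((n + 1) ** 2 * (n + 2))


# Hypothetical illustration: n = 1000, ranks 250 and 750 (lambda = .25, .75).
n = 1000
ranks = [250, 750]
lams = [r / n for r in ranks]
approx = asymptotic_cov(lams, [1.0, 1.0], n)  # uniform density is 1 at every u_i

exact_var = exact_uniform_cov(250, 250, n)
exact_cov = exact_uniform_cov(250, 750, n)
print(approx[0][0] / exact_var, approx[0][1] / exact_cov)
```

Both ratios are close to unity for a sample of this size, which is the sense in which the asymptotic moments stand in for the exact ones.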

It would often be useful to know the small sample distribution of the order
statistics, particularly in the case where the sample is drawn from a normal.
Fisher and Yates' tables [4] give the expected values of the order statistics up
to samples of size 50. However it would be very useful in the development
of certain small sample statistics to have further information. It is perhaps too
much to expect tabulated distribution functions, but at least the variances
and covariances would be useful. A joint effort has resulted in the calculation
for samples $n = 2, 3, \cdots, 10$ of the expected values to five decimal places,
the variances to four decimal places, and the covariances to nearly two decimal
places. It is expected that these tables will be published shortly.

3. Estimates of the mean of a normal distribution. It will be important
in what follows to define efficiency and to indicate its interpretation. Then we
shall construct some estimates of the means of certain distributions and compute
their efficiencies. Except for the tables given, the discussion is applicable to
the estimation of the mean of any symmetric distribution; and, of course, the
concept of efficiency is still more general in its application. A statistic $\hat\theta(O_n)$,
where $O_n$ is the sample, is said to be an efficient estimate of $\theta$ if

i) $\sqrt{n}(\hat\theta - \theta)$ is asymptotically normally distributed with zero mean and
finite variance, $\sigma^2(\hat\theta)$, and

ii) for any other statistic $\theta'$ with $\sqrt{n}(\theta' - \theta)$ asymptotically normally
distributed with zero mean and variance $\sigma^2(\theta')$, $\sigma^2(\hat\theta) \le \sigma^2(\theta')$.

The ratio $\sigma^2(\hat\theta)/\sigma^2(\theta')$ is termed the efficiency of $\theta'$ if $\hat\theta$ is an efficient estimate
of $\theta$. For discussion see Wilks [18, 1943]. The concepts of efficient statistic
or estimate and of efficiency were introduced by R. A. Fisher. They serve as
one measure of the amount of information a statistic draws from a sample.
It is also common practice to speak of relative efficiencies; for example, of the
statistics $\theta'$ and $\theta''$ described in ii) above, we say if $\sigma^2(\theta') < \sigma^2(\theta'')$ that the
efficiency of $\theta''$ relative to $\theta'$ is the ratio of the smaller variance to the larger.
This concept of efficiency has sometimes been used when the normality
assumption has been violated by one or both statistics, when one or both are biased,
and when small samples are considered. When used under these conditions
the concept of efficiency becomes more difficult to interpret, although a
comparison of the variation of two statistics about the value they are commonly
estimating is often of value.

In the case of estimates of the mean $a$ of a variable which is normally
distributed according to $N(x, a, \sigma^2)$ from a sample of $n$, we can often express the
variance of an asymptotically unbiased estimate as $\sigma^2(\hat\theta_1) = k_1\sigma^2/n$. The sample
mean $\hat\theta = \sum x_i/n$ is an efficient estimate of $a$ with variance $\sigma^2/n$. Then in such
cases the efficiency of $\hat\theta_1$ in estimating $a$ is $1/k_1$. The interpretation is merely
that to obtain the same precision using $\hat\theta_1$ as is possible with $\hat\theta$, one must use
a sample $k_1$ times as large.

Bearing in mind that we are at present searching for economical methods
for analyzing large samples, it is clear that the concept of efficiency offers us a
practical way of comparing cost of information with cost of obtaining it.
In the present section and in sections 4 and 5 we shall develop certain
systematic estimates of parameters of normally distributed variables. Our
procedure then will be to compare the efficiency of the systematic estimates with
the efficient statistic for estimating the parameter in question, and also in
sections 4 and 5 we compare our estimates with a statistic not involving squares
or products. Of course the efficient statistic for estimating the mean of a normal
is the sample mean; therefore in this section we will only compare our estimates
with the sample mean.

We can construct unbiased estimates of the mean of a normal distribution
from linear combinations of suitably chosen order statistics. These systematic
statistics will be asymptotically normally distributed if the order statistics
from which they are derived satisfy Condition 1. We will restrict ourselves to
a useful practical case where equal weights are used. In other words the
estimate discussed is just the average of $k$ order statistics, $k^{-1}\sum x(n_i)$. Suppose
the $x(n_i)$, $i = 1, 2, \cdots, k$, satisfy Condition 1, and that they are placed symmetrically,
$E[x(n_i)] - a = a - E[x(n_{k-i+1})]$, so that $E[\sum x(n_i)/k] = a$. An important
unsolved question is to discover what spacing
of the $x(n_i)$ will yield minimum variance, and thereafter at what rate the
efficiency of this optimally spaced estimate increases with $k$. Computational
methods bog down rapidly after $k = 3$. Because so little is known about this
problem it seems worthwhile to offer some results for three arbitrary spacings
(these results are of course useful in analyzing data).

If the $x(n_i)$ satisfy Theorem 2 we may approximate the variance of the
systematic statistic $\hat\theta_k = \sum x(n_i)/k$ by the usual formula

$$\sigma^2(\hat\theta_k) = E\left[\sum x(n_i)/k\right]^2 - \left[E\left(\sum x(n_i)/k\right)\right]^2. \tag{8}$$

We lose no generality by assuming the mean and variance of the underlying
normal to be 0 and 1 respectively. Then using the fact that $\sum u_i = 0$, and
the result of Theorem 1 we rewrite (8) as

$$\sigma^2(\hat\theta_k) = E\left[\sum (x(n_i) - u_i)/k\right]^2 = \frac{1}{k^2 n}\left[\sum_{i=1}^{k} \frac{\lambda_i(1 - \lambda_i)}{f_i^2} + 2\sum_{i<j} \frac{\lambda_i(1 - \lambda_j)}{f_i f_j}\right], \tag{9}$$

where $f_m = f(u_m)$.

Using the symmetry which makes $\lambda_i = 1 - \lambda_{k-i+1}$, $f_i = f_{k-i+1}$, and the fact
that for $k = 2r + 1$, $f_{r+1} = 1/\sqrt{2\pi}$, $\lambda_{r+1} = \frac{1}{2}$, we may simplify the right side
of equation (9) with the following results for $k = 1, 2, \cdots, 7$. The factor $1/k^2$
has not been disturbed. We also write the general formulas for the simplified
form of (9), but we omit a rather lengthy combinatorial argument which
establishes the generalization.


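$$k = 1: \quad \frac{\pi}{2n}$$

$$k = 2: \quad \frac{2\lambda_1}{4n f_1^2}$$

$$k = 3: \quad \frac{2}{9n}\left[\frac{\lambda_1}{f_1^2} + \sqrt{2\pi}\,\frac{\lambda_1}{f_1} + \frac{\pi}{4}\right]$$

$$k = 4: \quad \frac{2}{16n}\left[\frac{\lambda_1}{f_1^2} + \frac{2\lambda_1}{f_1 f_2} + \frac{\lambda_2}{f_2^2}\right]$$

$$k = 5: \quad \frac{2}{25n}\left[\frac{\lambda_1}{f_1^2} + \frac{2\lambda_1}{f_1 f_2} + \frac{\lambda_2}{f_2^2} + \sqrt{2\pi}\left(\frac{\lambda_1}{f_1} + \frac{\lambda_2}{f_2}\right) + \frac{\pi}{4}\right] \tag{10}$$

$$k = 6: \quad \frac{2}{36n}\left[\frac{\lambda_1}{f_1^2} + \frac{\lambda_2}{f_2^2} + \frac{\lambda_3}{f_3^2} + \frac{2\lambda_1}{f_1 f_2} + \frac{2\lambda_1}{f_1 f_3} + \frac{2\lambda_2}{f_2 f_3}\right]$$

$$k = 7: \quad \frac{2}{49n}\left[\frac{\lambda_1}{f_1^2} + \frac{\lambda_2}{f_2^2} + \frac{\lambda_3}{f_3^2} + \frac{2\lambda_1}{f_1 f_2} + \frac{2\lambda_1}{f_1 f_3} + \frac{2\lambda_2}{f_2 f_3} + \sqrt{2\pi}\left(\frac{\lambda_1}{f_1} + \frac{\lambda_2}{f_2} + \frac{\lambda_3}{f_3}\right) + \frac{\pi}{4}\right]$$

$$k = 2r: \quad \frac{2}{(2r)^2 n}\left[\sum_{i=1}^{r} \frac{\lambda_i}{f_i^2} + 2\sum_{i<j\le r} \frac{\lambda_i}{f_i f_j}\right], \qquad r \ge 1$$

$$k = 2r + 1: \quad \frac{2}{(2r+1)^2 n}\left[\sum_{i=1}^{r} \frac{\lambda_i}{f_i^2} + 2\sum_{i<j\le r} \frac{\lambda_i}{f_i f_j} + \sqrt{2\pi}\sum_{i=1}^{r} \frac{\lambda_i}{f_i} + \frac{\pi}{4}\right], \qquad r \ge 1.$$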


In addition to the possibility of minimizing the equations of (10) by numerical
methods, three other procedures suggest themselves: i) to space the order
statistics uniformly in probability; ii) to choose those $k$ order statistics whose
expected values are equal to the expected values of the order statistics in a
sample of size $k$ drawn from a unit normal; iii) to choose $\lambda_i = (i - \frac{1}{2})/k$.
Table I lists for $k = 1, 2,$ and $3$ the expected values $u_i$ of the order
statistics and the probability to the left of the expected values $\lambda_i$ for each of the
procedures. The chosen order statistics are counted from left to right. It
will be noticed that the third method gives very good results, and has the value
of simplicity of formula. Table II gives a comparison between the
efficiencies resulting from spacing by the three methods. The three optimum
cases are included for completeness.
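
The asymptotic efficiencies for the equal-probability spacing and for the spacing $\lambda_i = (i - \frac{1}{2})/k$ follow directly from equation (9), so they can be recomputed as a check. The following is only a sketch under the unit-normal assumption of this section; the function names are invented for the illustration, and the normal quantile and density come from Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist


def mean_estimate_efficiency(lams):
    """Asymptotic efficiency, relative to the sample mean, of the average
    of the order statistics at probabilities lams (unit normal parent).

    n * variance is the bracket of equation (9) divided by k**2; the sample
    mean has n * variance equal to 1, so the efficiency is the reciprocal.
    """
    nd = NormalDist()
    k = len(lams)
    dens = [nd.pdf(nd.inv_cdf(p)) for p in lams]  # f_i = f(u_i)
    total = 0.0
    for i in range(k):
        for j in range(k):
            lo = min(lams[i], lams[j])
            hi = max(lams[i], lams[j])
            total += lo * (1.0 - hi) / (dens[i] * dens[j])
    return k ** 2 / total


def equal_probability(k):
    """Procedure i): lambda_i = i / (k + 1)."""
    return [i / (k + 1) for i in range(1, k + 1)]


def half_step(k):
    """Procedure iii): lambda_i = (i - 1/2) / k."""
    return [(i - 0.5) / k for i in range(1, k + 1)]


for k in (1, 2, 3):
    print(k,
          round(mean_estimate_efficiency(equal_probability(k)), 3),
          round(mean_estimate_efficiency(half_step(k)), 3))
```

For $k = 1, 2, 3$ the printed values can be compared with the first and third columns of Table II.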

Statisticians planning to use the method of expected values suggested above
will find Fisher and Yates' [4, 1943] table of the expected values of the order
statistics in samples of size $k$ drawn from a unit normal helpful for computing
the $\lambda_i$. Alternatively Table III, which gives the $\lambda_i$, might be used.

As an example of the use of Table III, suppose we are using the expected
value method for estimating the mean of a large sample drawn from a normal
distribution $N(x, a, \sigma^2)$. If we are willing to use 6 observations out of 1000 for
this purpose Table III indicates the selection of $x_{103}, x_{261}, x_{420}, x_{580}, x_{740}, x_{898}$.
Furthermore Table II indicates that the variance of the estimate of $a$ based
on the average of these six observations will be approximately $\sigma^2/.948n$, $n = 1000$.
4. Estimates of the standard deviation. The statistic

$$s^2 = \sum_{i=1}^{n} (x'_i - \bar{x})^2/(n - 1),$$

TABLE I

Comparison of the order statistics which would be chosen according to each of the
four procedures for subsamples of k = 1, 2, 3

     Order          Optimum        Equal Probability    Expected Values    λ_i = (i − ½)/k
 k   Statistic    u_i     λ_i       u_i     λ_i          u_i     λ_i        u_i     λ_i

 1   First

 2   First       -.6121                     .3333       -.5642   .2863     -.6745
     Second       .6121   .7298             .6667        .5642   .7137      .6745

 3   First                .1826    -.6745               -.8463   .1987     -.9674   .1667
     Second
     Third        .9056   .8174     .6745                .8463              .9674   .8333

TABLE II

Comparison of the efficiencies of four methods of spacing k order statistics used
in the construction of an estimate of the mean

  k    λ_i = i/(k+1)    Expected Values*    λ_i = (i − ½)/k    Optimum

  1        .637              .637                .637            .637
  2        .793              .809                .808            .810
  3        .860              .878                .878            .879
  4        .896              .914                .913
  5        .918              .933                .934
  6        .933              .948                .948
  7        .944              .956                .957
  8        .952              .963                .963
  9        .957              .968                .969
 10        .962              .972                .973

* The u_i are chosen equal to the expected values of the order statistics of a sample of
size k.


where $\bar{x} = \sum_{i=1}^{n} x'_i/n$, is well known to be an unbiased estimate of the
population variance $\sigma^2$, for $n > 1$. However $s$ is not in general an unbiased estimate
of $\sigma$. We are not interested here in the question of when we should estimate $\sigma$
and when it is more advantageous to estimate $\sigma^2$. All we want is to have an
unbiased estimate of $\sigma$, based on sums of squares, to compare with another
unbiased estimate based on order statistics. In the case of observations drawn
from a normal distribution

$$s' = \sqrt{\frac{n-1}{2}}\; \frac{\Gamma(\frac{1}{2}[n - 1])}{\Gamma(\frac{1}{2}n)}\; s \tag{11}$$

is an unbiased estimate of $\sigma$ (see for example Kenney [11]), with variance

$$\sigma^2(s') = \left[\frac{n-1}{2}\left(\frac{\Gamma(\frac{1}{2}[n - 1])}{\Gamma(\frac{1}{2}n)}\right)^2 - 1\right]\sigma^2. \tag{12}$$

TABLE III*

P(x < u_{i|k}) × 10^4, u_{i|k} = E(x_{i|k}), x_{i|k} is the ith order statistic in a sample of
size k drawn from a normal distribution N(x, 0, 1)

  k \ i     1      2      3      4      5      6      7      8      9

   1
   2      2863   7137
   3      1987          8013
   4      1516   3832   6168   8484
   5      1224   3103   5000   6897   8776
   6      1025          4201   5799   7395   8975
   7             2244   3622          6378   7756   9119
   8             1971   3182                 6818   8030   9227
   9             1756   2837   3919                 7163   8244   9312
  10             1584   2559   3536   4512   5488   6464   7441   8416

* The table is given to more places than necessary for the purpose suggested because it
may be of interest in other applications. The E(x_{i|k}) from which the table was derived
were computed to five decimal places.

For most practical purposes however, when $n > 10$, the bias in $s$ is negligible.
For large samples $\sigma^2(s')$ approaches $\sigma^2/2n$.
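
The multiplier in (11) and the variance in (12) are easy to evaluate numerically, which gives a quick check on the $s'$ column of Table IV and on the large-sample limit $\sigma^2/2n$. The sketch below is an illustration only; the function names are invented here, and the log-gamma function is used so that the gamma ratio does not overflow for large $n$.

```python
from math import exp, lgamma, sqrt


def unbiasing_constant(n):
    """Multiplier of s in equation (11):
    sqrt((n - 1) / 2) * Gamma((n - 1) / 2) / Gamma(n / 2)."""
    return sqrt((n - 1) / 2.0) * exp(lgamma((n - 1) / 2.0) - lgamma(n / 2.0))


def relative_variance_s_prime(n):
    """sigma^2(s') / sigma^2 from equation (12)."""
    return unbiasing_constant(n) ** 2 - 1.0


for n in (2, 3, 10, 1000):
    print(n, round(relative_variance_s_prime(n), 4), round(1.0 / (2 * n), 4))
```

For small $n$ the exact value is noticeably larger than $1/2n$; by $n = 1000$ the two agree to within a fraction of one per cent.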


4A. The range as an estimate of $\sigma$. As mentioned in the Introduction,
section 1, it is now common practice in industry to estimate the standard
deviation by means of a multiple of the range $R' = c_n(x_n - x_1)$, for small samples,
where $c_n = 1/[E(y_n) - E(y_1)]$, $y_n$ and $y_1$ being the greatest and least observations
drawn from a sample of size $n$ from a normal distribution $N(y, a, 1)$. Although
we are principally interested in large sample statistics, for the sake of
completeness, we shall include a few remarks about the use of the range in small samples.

Now $R'$ is an unbiased estimate of $\sigma$, and its variance may be computed for
small samples; see for example Hartley [7, 1942]. In the present case, although
both $R'$ and $s'$ are unbiased estimates of $\sigma$, they are not normally distributed,
nor are we considering their asymptotic properties; therefore the previously
defined concept of efficiency does not apply. We may however use the ratio
of the variances as an arbitrary measure of the relative precision of the two
statistics. The following table lists the ratio of the variances of the two
statistics, as well as the variances themselves expressed as a multiple of the
population variance for samples of size $n = 2, 3, \cdots, 10$.


TABLE IV

Relative precision of s' and R', and their variances expressed as a multiple of σ²,
the population variance

  n    σ²(s')/σ²(R')    σ²(s')/σ²    σ²(R')/σ²

  2        1.000           .570         .570
  3         .990           .273         .276
  4         .977           .178         .182
  5         .962           .132         .137
  6         .932           .104         .112
  7         .910           .0864        .0949
  8         .889           .0738        .0830
  9         .869           .0643        .0740
 10         .851           .0570        .0670

4B. Quasi ranges for estimating $\sigma$. The fact that the ratio $\sigma^2(s')/\sigma^2(R')$
falls off in Table IV as $n$ increases makes it reasonable to inquire whether it
might not be worthwhile to change the systematic estimate slightly by using
the statistic $c_{1|n}[x_{n-1} - x_2]$, or more generally $c_{r|n}[x_{n-r} - x_{r+1}]$ where $c_{r|n}$ is the
multiplicative constant which makes the expression an unbiased estimate of $\sigma$
(in particular $c_{r|n}$ is the constant to be used when we count in $r + 1$ observations
from each end of a sample of size $n$, thus $c_{r|n} = 1/[E(y_{n-r} - y_{r+1})]$ where the
$y$'s are drawn from $N(y, a, 1)$). This is certainly the case for large values of $n$,
but with the aid of the unpublished tables mentioned at the close of section 2,
we can say that it seems not to be advantageous to use $c_{1|n}[x_{n-1} - x_2]$ for $n \le 10$.
Indeed the variance of $c_{1|10}[x_9 - x_2]$, for the unit normal, seems to be about .10,
as compared with $\sigma^2(R')/\sigma^2 = .067$ as given by Table IV, for $n = 10$. The
uncertainty in the above statements is due to a question of significant figures.

Considerations which suggest constructing a statistic based on the difference
of two order statistics which are not extreme values in small samples, weigh
even more heavily in large samples. A reasonable estimate of $\sigma$ for normal
distributions, which could be calculated rapidly by means of punched-card
equipment is

$$\hat\sigma = \frac{1}{c}\,[x(n_2) - x(n_1)], \tag{13}$$

where the $x(n_i)$ satisfy Condition 1, and where $c = u_2 - u_1$; $u_2$ and $u_1$ are the
expected values of the $n_2$th and $n_1$th order statistics of a sample of size $n$ drawn from
a unit normal. Without loss of generality we shall assume the $x_i$ are drawn
from a unit normal. Furthermore we let $n_2/n = \lambda_2 = 1 - \lambda_1 = 1 - n_1/n$. Of
course $\hat\sigma$ will be asymptotically normally distributed, with variance

$$\sigma^2(\hat\sigma) = \frac{1}{nc^2}\left[\frac{\lambda_1(1 - \lambda_1)}{[f(u_1)]^2} + \frac{\lambda_2(1 - \lambda_2)}{[f(u_2)]^2} - \frac{2\lambda_1(1 - \lambda_2)}{f(u_1)\, f(u_2)}\right]. \tag{14}$$

Because of symmetry $f(u_1) = f(u_2)$; using this and the fact that $\lambda_2 = 1 - \lambda_1$,
we can reduce (14) to

$$\sigma^2(\hat\sigma) = \frac{2}{nc^2}\,\frac{\lambda_1(1 - 2\lambda_1)}{[f(u_1)]^2}. \tag{15}$$

We are interested in optimum spacing in the minimum variance sense. The
minimum for $\sigma^2(\hat\sigma)$ occurs when $\lambda_1 \doteq .0694$, and for that value of $\lambda_1$, $\sigma^2(\hat\sigma) \doteq
.767\,\sigma^2/n$. Asymptotically $s'$ is also normally distributed, with $\sigma^2(s') = \sigma^2/2n$.
Therefore we may speak of the efficiency of $\hat\sigma$ as an estimate of $\sigma$ as .652. It is
useful to know that the graph of $\sigma^2(\hat\sigma)$ is very flat in the neighborhood of the
minimum, and therefore varying $\lambda_1$ by .01 or .02 will make little difference in
the efficiency of the estimate $\hat\sigma$ (providing of course that $c$ is appropriately
adjusted). K. Pearson [13] suggested this estimate in 1920. It is amazing that
with punched-card equipment available it is practically never used when the
appropriate conditions described in the Introduction are present.

The occasionally used semi-interquartile range, defined by $\lambda_1 = .25$, has an
efficiency of only .37 and an efficiency relative to $\hat\sigma$ of only .56.
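
The minimization of (15) can be reproduced with a simple grid search. The sketch below is illustrative only: it takes the unit normal as parent, uses the population quantile $u_1$ with $\lambda_1 = P(x < u_1)$ and $c = -2u_1$ (the large-sample equivalents of the quantities in the text), and the function name and grid are choices made for this example.

```python
from statistics import NormalDist


def two_quantile_n_variance(lam1):
    """n * sigma^2(sigma_hat) / sigma^2 from equation (15), where
    u_1 is the lam1 quantile of the unit normal and c = u_2 - u_1 = -2 * u_1."""
    nd = NormalDist()
    u1 = nd.inv_cdf(lam1)
    c = -2.0 * u1
    return 2.0 * lam1 * (1.0 - 2.0 * lam1) / (c ** 2 * nd.pdf(u1) ** 2)


# Grid search over lambda_1 in steps of .0001 (a hypothetical grid).
grid = [i / 10000.0 for i in range(100, 4000)]
best_lam = min(grid, key=two_quantile_n_variance)
best_var = two_quantile_n_variance(best_lam)

efficiency = 0.5 / best_var  # relative to s', whose n * variance is 1/2
quartile_efficiency = 0.5 / two_quantile_n_variance(0.25)
print(best_lam, round(best_var, 3), round(efficiency, 3), round(quartile_efficiency, 2))
```

Evaluating `two_quantile_n_variance` at a few points on either side of `best_lam` also shows how flat the curve is near its minimum.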

As in the case of the estimate of the mean by systematic statistics, it is
pertinent to inquire what advantage may be gained by using more order statistics
in the construction of the estimate of $\sigma$. If we construct an estimate based on
four order statistics, and then minimize the variance, it is clear that the extreme
pair of observations will be pushed still further out into the tails of the
distribution. This is unsatisfactory from two points of view in practice: i) we will
not actually have an infinite number of observations, therefore the
approximation concerning the normality of the order statistics may not be adequate if $\lambda_1$
is too small, even in the presence of truly normal data; ii) the distribution
functions met in practice often do not satisfy the required assumption of
normality, although over the central portion of the function containing most of the
probability, say except for the 6% in each tail, normality may be a good
approximation. In view of these two points it seems preferable to change the question
slightly and ask what advantage will accrue from holding two observations at
the optimum values just discussed (say $\lambda_1 = .07$, $\lambda_4 = .93$) and introducing
two additional observations more centrally located.

We define a new statistic

$$\hat\sigma' = \frac{1}{c'}\,[x(n_4) + x(n_3) - x(n_2) - x(n_1)], \tag{16}$$

$c' = E[x(n_4) + x(n_3) - x(n_2) - x(n_1)]$, where the observations are drawn from
a unit normal. We take $\lambda_1 = 1 - \lambda_4$, $\lambda_2 = 1 - \lambda_3$, $\lambda_1 = .07$. It turns out that
$\sigma^2(\hat\sigma')$ is minimized for $\lambda_2$ in the neighborhood of .20, and that the efficiency
compared with $s'$ is a little more than .75. Thus an increase of two observations in
the construction of our estimate of $\sigma$ increases the efficiency from .65 to .75.
We get practically the same result for $.16 \le \lambda_2 \le .22$.

Furthermore, it turns out that using $\lambda_1 = .02$, $\lambda_2 = .08$, $\lambda_3 = .15$, $\lambda_4 = .25$,
$\lambda_5 = .75$, $\lambda_6 = .85$, $\lambda_7 = .92$, $\lambda_8 = .98$, one can get an estimate of $\sigma$ based on
eight order statistics which has an efficiency of .896. This estimate is more
efficient than either the mean deviation about the mean or median for
estimating $\sigma$. The estimate is of course

$$\hat\sigma'' = [x(n_8) + x(n_7) + x(n_6) + x(n_5) - x(n_4) - x(n_3) - x(n_2) - x(n_1)]/C,$$

where $C = 10.34$.
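
The two-, four- and eight-statistic estimates are all signed sums of order statistics divided by a constant, so one routine based on the variances and covariances of Theorem 2 covers them. The following is a sketch under stated assumptions, not the computation used for the paper: the function name is invented, the parent is the unit normal, and the divisor is formed from population quantiles rather than from tabulated expected values of order statistics.

```python
from statistics import NormalDist


def signed_quantile_sd_estimate(lams, weights):
    """Return (divisor, asymptotic efficiency relative to s') for the estimate
    sum_i weights[i] * x(n_i) / divisor, with lambda_i = lams[i].

    n * Var of the signed sum uses Theorem 2:
    cov_ij = min(lam_i, lam_j) * (1 - max(lam_i, lam_j)) / (f_i * f_j).
    """
    nd = NormalDist()
    u = [nd.inv_cdf(p) for p in lams]
    dens = [nd.pdf(x) for x in u]
    divisor = sum(w * x for w, x in zip(weights, u))
    total = 0.0
    for i in range(len(lams)):
        for j in range(len(lams)):
            lo = min(lams[i], lams[j])
            hi = max(lams[i], lams[j])
            total += weights[i] * weights[j] * lo * (1.0 - hi) / (dens[i] * dens[j])
    n_var = total / divisor ** 2
    return divisor, 0.5 / n_var  # s' has n * variance 1/2


c2, eff2 = signed_quantile_sd_estimate([0.0694, 0.9306], [-1, 1])
c4, eff4 = signed_quantile_sd_estimate([0.07, 0.20, 0.80, 0.93], [-1, -1, 1, 1])
c8, eff8 = signed_quantile_sd_estimate(
    [0.02, 0.08, 0.15, 0.25, 0.75, 0.85, 0.92, 0.98],
    [-1, -1, -1, -1, 1, 1, 1, 1],
)
print(round(eff2, 3), round(eff4, 3), round(c8, 2), round(eff8, 3))
```

The divisor returned for the eight-statistic case can be compared with the constant $C$ quoted above, and the three efficiencies with the values .65, .75 and .896 given in the text.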

To summarize: in estimating the standard deviation $\sigma$ of a normal distribution
from a large sample of size $n$, an unbiased estimate of $\sigma$ is

$$\hat\sigma = \frac{1}{c}\,(x_{n-r+1} - x_r),$$

where $c = E(y_{n-r+1} - y_r)$ where the $y$'s are drawn from $N(y, a, 1)$. The estimate
$\hat\sigma$ is asymptotically normally distributed with variance

$$\sigma^2(\hat\sigma) = \frac{2}{nc^2}\,\frac{\lambda_1(1 - 2\lambda_1)}{[f(u_1)]^2},$$

where $\lambda_1 = r/n$, $f(u_1) = N(E(x_r), 0, \sigma^2)$. We minimize $\sigma^2(\hat\sigma)$ for large samples
when $\lambda_1 \doteq .0694$, and for that value of $\lambda_1$,

$$\sigma^2_{\text{opt}}(\hat\sigma) = \frac{.767\,\sigma^2}{n}.$$

The unbiased estimate of $\sigma$

$$\hat\sigma' = \frac{1}{c'}\,(x_{n-r+1} + x_{n-s+1} - x_s - x_r)$$

may be used in lieu of $\hat\sigma$. If $\lambda_1 = r/n$, $\lambda_2 = s/n$ we find

$$\sigma^2(\hat\sigma' \mid \lambda_1 = .07,\ \lambda_2 = .20) = \frac{.66\,\sigma^2}{n}.$$

4C. The mean deviations about the mean and median. The next level of
computational difficulty we might consider for the construction of an estimate
of $\sigma$ is the process of addition. The mean deviation about the mean is a well
known, but not often used statistic. It is defined by

$$\text{m.d.} = \sum_{i=1}^{n} |x_i - \bar{x}|/n. \tag{17}$$

For large samples from a normal distribution the expected value of m.d. is
$\sqrt{2/\pi}\,\sigma$, therefore to obtain an unbiased estimate of $\sigma$ we define the new statistic
$A = \sqrt{\pi/2}\ \text{m.d.}$ Now for large samples $A$ has variance $\sigma^2[\frac{1}{2}(\pi - 2)]/n$, or an
efficiency of .876. However there are slight awkwardnesses in the computation
of $A$ which the mean deviation about the median does not have.

It turns out that for samples of size $n = 2m + 1$ drawn from a normal
distribution N(y, a, 0) the statistic

(18) M' = 
asymptotically has mean a and variance 

(19) AM') - 


Thus in estimating the standard deviation of a normal distribution from 
large samples we can get an efficiency of .65 by the judicious selection of two 
observations from the sample, an efficiency of .75 by using four observations, 
and an efficiency of .88 by using the mean deviation of all the observations from 
either the mean or the median of the sample, and an efficiency of .90 by using 
eight order statistics. 


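5. Estimation of the correlation coefficient. In the present section we
consider the estimation of the correlation coefficient of a normal bivariate population:

$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1 - \rho^2}} \exp\left\{-\frac{1}{2(1 - \rho^2)}\left[\frac{(x - a)^2}{\sigma_x^2} + \frac{(y - b)^2}{\sigma_y^2} - \frac{2\rho(x - a)(y - b)}{\sigma_x\sigma_y}\right]\right\}. \tag{20}$$

The efficient estimate of $\rho$ in a sample $O_n: (x'_1, y'_1), \cdots, (x'_n, y'_n)$ drawn
from the density (20) is

$$r = \frac{\sum (x'_i - \bar{x}')(y'_i - \bar{y}')}{\sqrt{\sum (x'_i - \bar{x}')^2\, \sum (y'_i - \bar{y}')^2}}. \tag{21}$$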

There are numerous other techniques in the literature for estimating $\rho$, among
them i) the tetrachoric correlation coefficient which depends on a four-fold table,
ii) the adjusted rank correlation coefficient which depends on assigning ranks to
the $x$ and $y$ observations. These and other estimates of the correlation
coefficient are discussed by Kendall [10].

We shall be concerned with the construction of some estimates of the
correlation coefficient which are particularly adapted for use with punched-card
equipment. A counting sorter is adequate for the first two cases discussed;
in line with our previous development we shall then consider a technique which
uses simple addition of the observed values, but does not require sums of squares
or products (in the special case where variances of $x$ and $y$ are equal).
5A. Estimation of $\rho$ when means and standard deviations are known. Let
us suppose that the means and variances of the variables $x$ and $y$, distributed
according to (20) are given, and consider the problem of estimating the
correlation coefficient $\rho$ from a sample of size $n$. There will be no generality lost
by assuming $a = b = 0$, $\sigma_x^2 = \sigma_y^2 = 1$. The technique used will be to construct
lines $y = 0$, $x = \pm k$, which cut the $xy$-plane into six parts. We will form an
estimate of $\rho$ based upon the number of observations falling in the four corners.
Figure 1 represents the lines laid out in the manner suggested in connection with
a scatter diagram of 25 observations; naturally the method is recommended for
use only with large samples, the 25 observations are for purposes of illustration
only.

Fig. 1. Diagram of the Construction Described in Paragraph 5A with a Sample of
25 Observations Superimposed

More specifically, after assigning the special values mentioned immediately
above to the means and variances in (20), we define



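$$p_1 = \int_0^{\infty}\!\int_k^{\infty} f(x, y)\, dx\, dy,$$

$$p_2 = \int_{-\infty}^{0}\!\int_{-\infty}^{-k} f(x, y)\, dx\, dy,$$

$$p_3 = \int_0^{\infty}\!\int_{-\infty}^{-k} f(x, y)\, dx\, dy, \tag{22}$$

$$p_4 = \int_{-\infty}^{0}\!\int_k^{\infty} f(x, y)\, dx\, dy,$$

$$p_5 = \int_{-\infty}^{\infty}\!\int_{-k}^{k} f(x, y)\, dx\, dy = \int_{-k}^{k} N(x, 0, 1)\, dx.$$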
We denote by $n_i$ the number of observations falling into the region containing
probability $p_i$. Of course $\sum n_i = n$. Now we may write the joint
probability distribution of the $n_i$ as

$$g(n_1, n_2, n_3, n_4) = \frac{n!}{\prod_{i=1}^{5} n_i!}\, \prod_{i=1}^{5} p_i^{\,n_i}, \tag{23}$$

remembering that $n_5 = n - \sum_{i=1}^{4} n_i$.

i 


We shall now derive the maximum likelihood estimate of $\rho$ from (23). Taking
the logarithm of (23) we have

$$\log g = \log c + \sum_{i=1}^{5} n_i \log p_i, \tag{24}$$

where $c$ is the multinomial coefficient on the right of (23). Differentiating (24)
with respect to $\rho$ gives

$$\frac{d(\log g)}{d\rho} = \sum_{i=1}^{4} \frac{n_i\, p_i'}{p_i}, \tag{25}$$

where $p_i' = dp_i/d\rho$; of course $dp_5/d\rho = 0$ because $p_5$ is functionally independent of $\rho$.


To get $\hat\rho$, the maximum likelihood estimate of $\rho$, under our restrictions, we must
equate the right of (25) to zero and solve for $\rho$. Before proceeding it will be
useful to note the following relations:

$$p_1 = p_2; \qquad p_3 = p_4;$$

$$p_1' = -p_4'; \qquad p_2' = -p_3'; \qquad p_1' = p_2'; \qquad p_3' = p_4'; \tag{26}$$

$$p_1 + p_4 = \int_k^{\infty} N(x, 0, 1)\, dx = \lambda; \qquad p_2 + p_3 = \int_{-\infty}^{-k} N(x, 0, 1)\, dx = \lambda.$$

If after making appropriate substitutions from (26) we set the right of (25)
equal to zero we get

$$\frac{n_1 p_1'}{p_1} - \frac{n_3 p_1'}{\lambda - p_1} + \frac{n_2 p_1'}{p_1} - \frac{n_4 p_1'}{\lambda - p_1} = 0,$$

and since in general $p_1' \neq 0$, the condition is that

$$\frac{n_1 + n_2}{n_3 + n_4} = \frac{p_1}{\lambda - p_1}. \tag{27}$$

Unless all four of the $n_i$ are zero (which is unlikely for reasonable values of $\lambda$
because $n$ is large), it is possible to find a value of $\rho$ which will make the right
side of (27) equal to the ratio formed from the observations on the left, and
the value of $\rho$ so determined is the maximum likelihood estimate $\hat\rho$ under the
restrictions we have imposed. In practice this equation may be solved by
consulting a table of the bivariate normal distribution; see for example K. Pearson
[14]. Alternatively (27) may be solved by referring to Figure 3. Truman Kelley
[9, 1939] has considered a closely related problem in connection with the
validation of test items.

It may be inquired whether it would not be preferable to reduce the present
design to a tetrachoric case by using only the cutting lines $x = 0$, $y = 0$. An
investigation of the variance of $\hat\rho$ reveals that such is not the case. We proceed
to determine the asymptotic variance by means of the usual maximum likelihood
technique. Differentiating (25) once more we have

$$
\frac{d^2(\log g)}{d\rho^2} = \sum_{i=1}^{4} \frac{n_i\,(p_i p_i'' - p_i'^2)}{p_i^2},
\tag{28}
$$

where $p_i'' = d^2 p_i/d\rho^2$. We note that $E(n_i) = np_i$, therefore

$$
E\left[\frac{d^2(\log g)}{d\rho^2}\right] = n\left[\sum_{i=1}^{4} p_i'' - \sum_{i=1}^{4} \frac{p_i'^2}{p_i}\right],
\tag{29}
$$

but since the derivative of a sum is equal to the sum of its derivatives, and
$p_1 + p_4 = \lambda$, $p_2 + p_3 = \lambda$, the first sum in the square brackets vanishes.
Suitable substitutions from (26) will reduce the second sum so that we get

$$
E\left[\frac{d^2(\log g)}{d\rho^2}\right] = -\frac{2n p_1'^2 \lambda}{p_1(\lambda - p_1)}.
\tag{30}
$$

Therefore asymptotically $\hat\rho$ is normally distributed with variance

$$
\sigma^2(\hat\rho) = \frac{p_1(\lambda - p_1)}{2n\lambda p_1'^2}.
\tag{31}
$$


In general the optimum value (in the minimum variance sense) of $\lambda$ which
determines the cutting lines $x = \pm k$ will depend on the true value of $\rho$. To carry
out the minimization process in general will require fairly extensive computations,
which we feel would be justified. For the present we shall restrict ourselves
to minimizing $\sigma^2(\hat\rho)$ for the case $\rho = 0$.

We have

$$
p_1' = \frac{1}{2\pi}\exp\left[-\tfrac{1}{2}k^2\right] = \frac{1}{\sqrt{2\pi}}\,N(k, 0, 1)
$$

when $\rho = 0$, and $p_1 = \tfrac{1}{2}\lambda$. This gives

$$
\sigma^2(\hat\rho \mid \rho = 0) = \frac{\pi\lambda}{4n\,[N(k, 0, 1)]^2}.
\tag{32}
$$

We wish to minimize the expression on the right. We recall that a similar
expression, the ratio of $\lambda$ to the squared ordinate, was to be minimized in section 3
when the optimum pair of observations for estimating the mean of a normal
distribution was found. Using the previous results we have $\lambda = .2702$, $k = .6121$;
which gives us finally

$$
\sigma^2_{\mathrm{opt}}(\hat\rho \mid \rho = 0) = \frac{1.939}{n}.
\tag{33}
$$


To summarize: if a sample of size $n$ is drawn from a normal bivariate population
with known means $a_x$, $a_y$ and variances $\sigma_x^2$ and $\sigma_y^2$, but unknown correlation $\rho$, the
maximum likelihood estimate of $\rho$ based on the number of observations falling in
the four corners of the plane determined by the lines $x = a_x \pm k\sigma_x$, $y = a_y$ is found
by solving for $\rho$ the equation

$$
\frac{n_1 + n_3}{n_1 + n_2 + n_3 + n_4} = \frac{p_1}{\lambda},
$$

where $n_1$ is the number of observations falling in the upper right, $n_2$ in the upper
left, $n_3$ in the lower left, $n_4$ in the lower right-hand corner, and $p_i$ is the probability
density in the region into which the $n_i$ fall, $\lambda = p_1 + p_4$. The variance of this
estimate $\hat\rho$ is given by

$$
\sigma^2(\hat\rho) = \frac{p_1(\lambda - p_1)}{2n\lambda p_1'^2},
$$

which is minimized for $\rho = 0$ by setting $k = .6121$, $\lambda = .2702$, giving

$$
\sigma^2_{\mathrm{opt}}(\hat\rho \mid \rho = 0) = \frac{1.939}{n}.
$$

On the other hand if the usual tetrachoric estimate is used with $x = 0$, $y = 0$
as the cutting lines we get $\sigma^2 = \pi^2/4n$ when $\rho = 0$. The relative efficiency of
the tetrachoric compared with the optimum statistic is therefore .787. The
variance of the efficient estimate $r$ given in (25) when $\rho = 0$ is $1/n$. Consequently
the efficiency of our estimate $\hat\rho$ compared to that of $r$ is about .515 for the special
case $\rho = 0$ under consideration. This means about twice as large a sample is
required to get the same precision with $\hat\rho$ as with $r$. Doubling the sample and
using the cruder statistic $\hat\rho$ may often be an economical procedure.
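
The constants quoted above can be checked numerically. The following Python sketch is an illustration only: the function names are hypothetical and not from the paper, and it assumes nothing beyond the standard normal ordinate and distribution function from the standard library. It minimizes the right side of (32) over $k$ by a ternary search and compares the result with the tetrachoric design $k = 0$.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal, N(x, 0, 1)


def tail(k):
    """lambda: integral of N(x, 0, 1) from k to infinity."""
    return 1.0 - STD.cdf(k)


def variance_constant(k):
    """n * sigma^2(rho_hat | rho = 0) from (32): pi * lambda / (4 * N(k,0,1)^2)."""
    return math.pi * tail(k) / (4.0 * STD.pdf(k) ** 2)


def optimum_k(lo=0.0, hi=2.0, tol=1e-9):
    """Ternary search for the minimizing k (the function is unimodal on [0, 2])."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if variance_constant(m1) < variance_constant(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


k_opt = optimum_k()
lam_opt = tail(k_opt)
var_opt = variance_constant(k_opt)    # optimum three-line design
var_tetra = variance_constant(0.0)    # tetrachoric design, pi^2 / 4

print("k, lambda, n*variance:", k_opt, lam_opt, var_opt)
print("efficiency of tetrachoric relative to optimum:", var_opt / var_tetra)
print("efficiency of optimum relative to r:", 1.0 / var_opt)
```

The printed values should fall close to the figures given in the text ($k$ near .6121, $\lambda$ near .2702, a variance constant near 1.94); small differences in the last digit reflect rounding in the original tables.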

It may be surmised that a still better estimate of $\rho$ could be constructed by
employing four cutting lines, say $x = \pm k$, $y = \pm k$. The simplifications which
we used to obtain the estimate $\hat\rho$ no longer hold when we use this new construction.
However, it is still possible to compute the minimum variance of the
new estimate, which we will call $\rho'$, for the special case $\rho = 0$. It again turns
out that $k = .6121$ minimizes and we get

$$
\sigma^2_{\mathrm{opt}}(\rho' \mid \rho = 0) = \frac{1.52}{n},
\tag{34}
$$

which makes the efficiency of $\rho'$ (compared with $r$) about .66 as compared with
.515 for $\hat\rho$. This suggests that if some very simple technique can be found for
obtaining $\rho'$, $\rho'$ would be worth using. Unfortunately the author has not
been able to construct a rapid way of finding $\rho'$.


5B. Estimation of $\rho$ when the parameters are unknown. A more practical
situation than the case treated in paragraph 5A is the case in which all parameters
of (20) are unknown. This case will be treated by means of order statistics.
We construct an order statistic analogue of the estimate $\hat\rho$ which we
will call $\rho^*$. In general the procedure will be as follows. Each of the $N$ observations
in the sample has an $x$ coordinate and a $y$ coordinate:

i) order the observations with respect to the $x$ coordinate;

ii) discard all observations except the $n$ with the largest $x$ coordinates, called
the right set, and the $n$ with the smallest $x$ coordinates, called the left set, retaining,
therefore, $2n$ observations;

iii) order the pooled $2n$ observations with respect to the $y$ coordinate;

iv) break the $2n$ observations into two sets of $n$ observations each; the upper
set containing the $n$ observations with the greatest $y$ coordinates, and the
lower set containing the $n$ observations with the smallest $y$ coordinates;

v) reorder the upper set of observations with respect to the $x$ coordinate; the
$n$ observations will be divided into those whose $x$ coordinates belong to the
right set and those whose $x$ coordinates belong to the left set;

vi) the estimate $\rho^*$ will be obtained by solving the equation

$$
\frac{n_1^*}{n - n_1^*} = \frac{p_1^*}{\lambda^* - p_1^*},
\tag{35}
$$

where $n_1^*$ is the number of observations in the upper set which are also members
of the right set and $p_1^*$ is $\int_{0}^{\infty}\int_{k^*}^{\infty} f(x, y)\,dx\,dy$, while $f(x, y)$ is the bivariate
normal (20) with $\sigma_x = \sigma_y = 1$, $a = b = 0$, and $\int_{k^*}^{\infty} N(x, 0, 1)\,dx = n/N = \lambda^*$.

Figure 2 represents graphically the construction described above for a scatter
diagram composed of 25 observations. Of course the number 25 is only for
purposes of illustration, as the method is only proposed for use with large samples.
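
Steps i) to v) amount to two sorts and a count. The sketch below is one way to carry them out in Python; the function name and the representation of the sample as a list of `(x, y)` pairs are assumptions made for illustration, and ties among coordinates are assumed not to occur (as is the case with probability one for continuous data).

```python
def count_upper_right(pairs, n):
    """Return n1*: how many of the 2n retained observations lie in both
    the right set (n largest x) and the upper set (n largest y of the 2n)."""
    if n < 1 or 2 * n > len(pairs):
        raise ValueError("need 1 <= n and 2n <= N")
    by_x = sorted(pairs, key=lambda p: p[0])            # step i
    left, right = by_x[:n], by_x[-n:]                   # step ii
    pooled = [(p, False) for p in left] + [(p, True) for p in right]
    pooled.sort(key=lambda item: item[0][1])            # step iii
    upper = pooled[n:]                                  # step iv
    return sum(1 for _, in_right in upper if in_right)  # step v


# Hypothetical small sample, N = 6 and n = 2.
sample = [(1, 5), (2, 1), (3, 9), (4, 4), (5, 2), (6, 8)]
n1_star = count_upper_right(sample, 2)
print(n1_star, n1_star / 2)  # the count and the observed ratio n1*/n
```

The observed ratio $n_1^*/n$ is the quantity that step vi) sets equal to $p_1^*/\lambda^*$, which is an equivalent way of writing the relation between $n_1^*$, $p_1^*$ and $\lambda^*$.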

The procedure of ordering the $x$'s and choosing the right and left sets of
observations is analogous to cutting the bivariate distribution by the two lines
$x = \pm k$ as described in paragraph 5A; indeed $x = x_{n+1}$ and $x = x_{N-n}$ are the
corresponding lines, but they vary from sample to sample. To continue the
analogy, ordering the remaining observations with respect to $y$ and dividing
them into upper and lower sets of equal size is like cutting the plane with the
line $y = 0$. Finally formula (35) is analogous to formula (27). Another similar
change is that where formerly we had among relations (26) the equalities $p_1 = p_3$,
$p_2 = p_4$, we now have the corresponding relations amongst the number of
observations in the four corners of the plane, namely $n_1^* = n_3^*$, $n_2^* = n_4^*$, which
can readily be seen by inspection of the fourfold table we have constructed below
(omitting all reference to the $N - 2n$ pairs of observations we have discarded).

Fig. 2. Diagram of the Construction Described in Paragraph 5B on the Basis of 25
Observations, n = 6

                Left set    Right set    Totals
    Upper set   $n_2^*$     $n_1^*$      $n$
    Lower set   $n_3^*$     $n_4^*$      $n$
    Totals      $n$         $n$          $2n$

We have dwelt at length upon the analogy between the two constructions 
because one of the principal difficulties in working with order statistics is to 
design a mathematically workable model. The author has found it fruitful 
when constructing systematic statistics to study a workable analogy which does 
not involve the order statistics directly, and then to build upon correspondences 
such as those described.

Some may not wish to read further in this paragraph when they are informed
that asymptotically the variance of $\rho^*$ is essentially the same as that of $\hat\rho$. They
should proceed to page 404. For the others we proceed to the demonstration.

Suppose we draw a sample of $N$ pairs of observations $(x_i', y_i')$ from the
bivariate normal (20). If we discard from these all pairs except those with the
$n$ largest $x_i$ and the $n$ smallest $x_i$, we are left with the right set and the left set.
We shall need the joint distribution of $x_{n+1}$ and $x_{N-n}$,

$$
f(x_{n+1}, x_{N-n}) = \frac{N!}{n!\,(N - 2n - 2)!\,n!}
\left(\int_{-\infty}^{x_{n+1}} g(x)\,dx\right)^{n}
\left(\int_{x_{n+1}}^{x_{N-n}} g(x)\,dx\right)^{N-2n-2}
\left(\int_{x_{N-n}}^{\infty} g(x)\,dx\right)^{n}
g(x_{n+1})\,g(x_{N-n}),
\tag{36}
$$

where $g(x)$ is the marginal distribution of $x$ obtained from (20), $N(x, a, \sigma_x^2)$.
We assume $x_{n+1}$, $x_{N-n}$ satisfy Condition 1. Considering $x_{n+1}$, $x_{N-n}$ as fixed
and given for the moment we wish to look at the distribution of the $y$ coordinates.
We may consider the $y$ coordinates of the observations in the right set as drawn
from the distribution of $y$

$$
\varphi'(y) = \frac{\int_{x_{N-n}}^{\infty} f(x, y)\,dx}{\int_{-\infty}^{\infty}\int_{x_{N-n}}^{\infty} f(x, y)\,dx\,dy}
= \frac{\int_{x_{N-n}}^{\infty} f(x, y)\,dx}{\int_{x_{N-n}}^{\infty} g(x)\,dx}.
$$

Similarly the $y$ coordinates of the observations belonging to the left set may be
considered as independently drawn from

$$
\psi'(y) = \frac{\int_{-\infty}^{x_{n+1}} f(x, y)\,dx}{\int_{-\infty}^{\infty}\int_{-\infty}^{x_{n+1}} f(x, y)\,dx\,dy}
= \frac{\int_{-\infty}^{x_{n+1}} f(x, y)\,dx}{\int_{-\infty}^{x_{n+1}} g(x)\,dx}.
$$

To prevent confusion, in considering the $y$ order statistics of the two sets, we
shall designate those of the observations which are members of the right set
by $u_1, u_2, \ldots, u_n$; while those observations belonging to the left set will have
their ordered $y$ coordinates designated $v_1, v_2, \ldots, v_n$. Of course the $u$'s and $v$'s
separately satisfy an order relation like that given in (1).

The first question we answer is: given $x_{n+1}$, $x_{N-n}$, what is the probability
that when we collate the $u$'s and $v$'s and split the observations into the upper
set and lower set (see iv) there will be exactly $c$ observations in the lower set
whose $y$ coordinates are designated by $u$'s? In other words what is the probability
that exactly $c$ members of the lower set belong to the right set? An
example for small values of $n$ may clarify the problem. Suppose $n = 4$, and
we observe $u_1 < v_1 < v_2 < v_3 < u_2 < u_3 < v_4 < u_4$; the $y$ coordinates of
the lower set of observations are $u_1, v_1, v_2, v_3$, and only the observation with $u_1$
for its $y$ coordinate belongs to the right set, so for this case $c = 1$. To return
to our general problem, the probability that there are exactly $c$ observations
which are members of both the right set and the lower set is

$$
P(c \mid x_{n+1}, x_{N-n}) = 1 - p(v_{n-c} > u_{c+1}) - p(u_c > v_{n-c+1}),
\tag{37}
$$

where $p(w > z)$ is the probability that $w$ is greater than $z$. Now writing
$\varphi(z) = \int_{-\infty}^{z} \varphi'(t)\,dt$, $\psi(z) = \int_{-\infty}^{z} \psi'(t)\,dt$ we may rewrite (37) as

$$
\begin{aligned}
P(c \mid x_{n+1}, x_{N-n}) = 1
&- \frac{n!}{c!\,(n-c-1)!}\int_{u_{c+1}}^{\infty} [\psi(v_{n-c})]^{n-c-1}[1 - \psi(v_{n-c})]^{c}\,\psi'(v_{n-c})\,dv_{n-c} \\
&- \frac{n!}{(n-c)!\,(c-1)!}\int_{-\infty}^{u_c} [\psi(v_{n-c+1})]^{n-c}[1 - \psi(v_{n-c+1})]^{c-1}\,\psi'(v_{n-c+1})\,dv_{n-c+1}.
\end{aligned}
\tag{38}
$$

After integrating the first integral of (38) by parts and simplifying we can rewrite
(38) as

$$
P(c \mid x_{n+1}, x_{N-n}) = \frac{n!}{(n-c)!\,c!}[\psi(u_{c+1})]^{n-c}[1 - \psi(u_{c+1})]^{c}
+ \frac{n!}{(n-c)!\,(c-1)!}\int_{\psi(u_c)}^{\psi(u_{c+1})} \alpha^{n-c}(1 - \alpha)^{c-1}\,d\alpha.
\tag{39}
$$

We approximate the integral term of (39) by

$$
\frac{n!}{(n-c)!\,(c-1)!}\,[\psi(u_{c+1})]^{n-c}[1 - \psi(u_{c+1})]^{c-1}[\psi(u_{c+1}) - \psi(u_c)],
$$

which leads us to the approximation

$$
P(c \mid x_{n+1}, x_{N-n}) \approx \frac{n!}{(n-c)!\,c!}[\psi(u_{c+1})]^{n-c}[1 - \psi(u_{c+1})]^{c-1}
[1 + (c-1)\psi(u_{c+1}) - c\,\psi(u_c)].
\tag{40}
$$

The joint distribution of $u_c$, $u_{c+1}$ is given by

$$
Q(u_c, u_{c+1} \mid n) = \frac{n!}{(c-1)!\,(n-c-1)!}[\varphi(u_c)]^{c-1}[1 - \varphi(u_{c+1})]^{n-c-1}\varphi'(u_c)\,\varphi'(u_{c+1}).
\tag{41}
$$

Next we multiply $P$ as given by (40) by $Q$ from (41) and integrate out $u_c$. This
gives us, except for terms of higher order,

$$
\frac{n!\,n!}{c!\,(n-c-1)!\,c!\,(n-c)!}\,[\varphi(u_{c+1})]^{c}[1 - \varphi(u_{c+1})]^{n-c-1}[\psi(u_{c+1})]^{n-c}[1 - \psi(u_{c+1})]^{c}\,\varphi'(u_{c+1}).
\tag{42}
$$

When expression (42) is multiplied by (36), we finally get the approximate joint
distribution of $c$, $x_{n+1}$, $x_{N-n}$, $u_{c+1}$.

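Before proceeding further we let

$$
\varphi(u_{c+1}) = \frac{\int_{-\infty}^{u_{c+1}}\int_{x_{N-n}}^{\infty} f(x, y)\,dx\,dy}{1 - \lambda_2^*} = \frac{p_4^*}{1 - \lambda_2^*},
\qquad
\psi(u_{c+1}) = \frac{\int_{-\infty}^{u_{c+1}}\int_{-\infty}^{x_{n+1}} f(x, y)\,dx\,dy}{\lambda_1^*} = \frac{p_3^*}{\lambda_1^*},
\tag{43}
$$

where $\lambda_1^* = \int_{-\infty}^{x_{n+1}} g(x)\,dx$, $\lambda_2^* = \int_{-\infty}^{x_{N-n}} g(x)\,dx$. If we also let
$p_1^* = 1 - \lambda_2^* - p_4^*$, $p_2^* = \lambda_1^* - p_3^*$ we can write

$$
h(c, x_{n+1}, x_{N-n}, u_{c+1}) = K\,(\lambda_2^* - \lambda_1^*)^{N-2n-2}\,
p_1^{*\,n-c-1}\,p_2^{*\,c}\,p_3^{*\,n-c}\,p_4^{*\,c}\;p_4^{*\prime}\,\lambda_1^{*\prime}\,\lambda_2^{*\prime},
\tag{44}
$$

where the primes indicate derivatives of $p_4^*$, $\lambda_1^*$, $\lambda_2^*$ with respect to the
appropriate suppressed variables, $u_{c+1}$, $x_{n+1}$, $x_{N-n}$, respectively.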

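We now proceed to the maximum likelihood estimate of $\rho$. We take the logarithm
of (44) and then take partial derivatives with respect to $a$ the mean of
$x$, $b$ the mean of $y$, and $\rho$ the correlation coefficient. After equating these
partial derivatives to zero we have the following three maximum likelihood equations
which must be solved simultaneously to obtain the estimates $a^*$, $b^*$, and $\rho^*$:

$$
\frac{1}{N}\left[\frac{N - 2n - 2}{\lambda_2^* - \lambda_1^*}\frac{\partial(\lambda_2^* - \lambda_1^*)}{\partial a}
+ \frac{n - c - 1}{p_1^*}\frac{\partial p_1^*}{\partial a}
+ \frac{c}{p_2^*}\frac{\partial p_2^*}{\partial a}
+ \frac{n - c}{p_3^*}\frac{\partial p_3^*}{\partial a}
+ \frac{c}{p_4^*}\frac{\partial p_4^*}{\partial a}\right] = 0,
\tag{45}
$$

$$
\frac{1}{N}\left[\frac{n - c - 1}{p_1^*}\frac{\partial p_1^*}{\partial b}
+ \frac{c}{p_2^*}\frac{\partial p_2^*}{\partial b}
+ \frac{n - c}{p_3^*}\frac{\partial p_3^*}{\partial b}
+ \frac{c}{p_4^*}\frac{\partial p_4^*}{\partial b}\right] = 0,
\tag{46}
$$

$$
\frac{1}{N}\left[\frac{n - c - 1}{p_1^*}\frac{\partial p_1^*}{\partial \rho}
+ \frac{c}{p_2^*}\frac{\partial p_2^*}{\partial \rho}
+ \frac{n - c}{p_3^*}\frac{\partial p_3^*}{\partial \rho}
+ \frac{c}{p_4^*}\frac{\partial p_4^*}{\partial \rho}\right] = 0,
\tag{47}
$$

where terms $O(1/N)$ have been neglected. Equations (45) and (46) are satisfied,
again except for terms $O(1/N)$, when $a^* = \tfrac{1}{2}(x_{n+1} + x_{N-n})$, $b^* = u_{c+1}$. Using
this information we examine (47) and find it satisfied when

$$
\frac{n - c}{c} = \frac{p_1^*}{\lambda_1^* - p_1^*},
\tag{48}
$$

which is directly analogous to equation (27), and is the form promised in (35),
if $n_1^* = n - c$. The estimate $\rho^*$ is obtained by solving (48) for $\rho$, where
$p_1^* = \int_{0}^{\infty}\int_{k^*}^{\infty} f(x, y)\,dx\,dy$, and $f(x, y)$ is given by (20) with variances equal unity and
means equal zero, and $\int_{k^*}^{\infty} g(x)\,dx = \lambda_1^* = 1 - \lambda_2^* = n/N$.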

We shall not go through the derivation of $\sigma^2(\rho^*)$ here. The usual maximum
likelihood technique may be used. It turns out that the covariances between
$a^*$ and $\rho^*$ and between $b^*$ and $\rho^*$ are of smaller order. Neglecting such terms we
find that the variance is

$$
\sigma^2(\rho^*) = \frac{p_1^*(\lambda_1^* - p_1^*)}{2N\lambda_1^*\,p_1^{*\prime\,2}}.
\tag{49}
$$

To summarize: if a sample of size $N$ is drawn from a normal bivariate population
with unknown parameters, the maximum likelihood estimate of $\rho$ based on the
$2n$ observations composed of those observations with the $n$ largest $x$ coordinates and
the $n$ smallest $x$ coordinates may be obtained by solving for $\rho$ the equation

$$
\frac{n - c}{n} = \frac{p_1^*}{\lambda^*},
$$

where $\tfrac{1}{2} > \lambda^* = n/N > 0$, $p_1^* = \int_{0}^{\infty}\int_{k^*}^{\infty} f(x, y \mid \sigma_x = \sigma_y = 1,\ a = b = 0)\,dx\,dy$,
$\int_{k^*}^{\infty} N(x, 0, 1)\,dx = \lambda^*$, and $n - c$ is the number of the $2n$ observations with
largest $y$ coordinates, which also have largest $x$ coordinates. The variance of this
estimate $\rho^*$ is given by

$$
\sigma^2(\rho^*) = \frac{p_1^*(\lambda^* - p_1^*)}{2N\lambda^*\,p_1^{*\prime\,2}},
$$

and for $\rho = 0$ the variance is minimized by choosing $\lambda^* = .2702$, that is by choosing
that 27 per cent of the observations with largest $x$ coordinates, and that 27 per cent
with smallest $x$ coordinates, and for this value of $\lambda^*$

$$
\sigma^2_{\mathrm{opt}}(\rho^* \mid \rho = 0) = \frac{1.939}{N}.
$$


Equation (49) is of course exactly analogous to the expression given in (31)
for the case of known means and variances. Therefore if the variance minimization
problem is solved in general for the case of paragraph 5A, the large sample
solution of the problem for unknown means and variances will also be solved.

Figure 3 may be used to obtain the estimates $\hat\rho$ or $\rho^*$ in case the methods of
paragraphs 5A or 5B are used. Essentially the figure solves equations (27)
and (48). The procedure for the problem of paragraph 5A is

i) when $n_1 + n_3 > n_2 + n_4$ evaluate the ratio $(n_1 + n_3)/(n_1 + n_2 + n_3 + n_4) = x_0$ and
find the intersection of the line $x = x_0$ with the curve for the particular $\lambda$ being
used;

ii) through the point of intersection of the vertical line $x = x_0$ and the $\lambda$
curve draw a horizontal line;

iii) the value of $\hat\rho$ is indicated on the vertical axis at the point of intersection
of the horizontal line and the vertical axis;

iv) when $n_1 + n_3 < n_2 + n_4$ use the ratio $x_0 = (n_2 + n_4)/(n_1 + n_2 + n_3 + n_4)$ and follow
the same procedure; $\hat\rho$ will be the negative of the number appearing on the
vertical axis.

Fig. 3. Curves for Estimating the Correlation Coefficient $\rho$

Example. Suppose a sample of 1000 is drawn from a normal bivariate population
for which the mean of $x$ is $a$, the mean of $y$ is $b$, and the variance of
$x$ is $\sigma_x^2$, all three parameters known (it is not necessary to know $\sigma_y^2$). The $xy$
plane is cut by the three lines $x = a \pm k\sigma_x$, $y = b$, where, say, $k = .612$, so
that $\lambda = .27$. Suppose we find the observations are distributed as follows:

in the upper right-hand corner: 160 = $n_1$

in the lower left-hand corner: 170 = $n_3$

in the upper left-hand corner: 110 = $n_2$

in the lower right-hand corner: 110 = $n_4$.

To estimate $\rho$ we set up $x_0 = (n_1 + n_3)/(n_1 + n_2 + n_3 + n_4) = 330/550 = .6$.
Referring to Figure 3 we find that the estimate of $\rho$ is $\hat\rho = .20$.

In using Figure 3 for this case it is useful to know that for

    $\lambda$ = .50   k = 0.000        $\lambda$ = .27   k = 0.612
    $\lambda$ = .40   k = 0.253        $\lambda$ = .20   k = 0.841
    $\lambda$ = .30   k = 0.524        $\lambda$ = .10   k = 1.282

If the means and variances of the variables are unknown, we may use the
method of paragraph 5B:

i) when $n - c > c$ evaluate the ratio $(n - c)/n = x_0$, and find the intersection
of the line $x = x_0$ with the curve for the particular $\lambda^*$ being used;

ii) through the point of intersection of the vertical line $x = x_0$ and the $\lambda^*$
curve draw a horizontal line;

iii) the value of $\rho^*$ is indicated on the vertical axis at the point of intersection
of the horizontal line and the vertical axis;

iv) when $n - c < c$, use the ratio $c/n = x_0$ and follow the same procedure;
$\rho^*$ will be the negative of the number appearing on the vertical axis.

Example: Suppose a sample of 1000 is drawn from a normal bivariate population
with all parameters unknown. Suppose we set $n = 200$, and follow
the procedure given in paragraph 5B of this section, and suppose we find the
observations are distributed as follows:

in the upper right-hand corner: 50 = $n - c$

then of course

in the lower left-hand corner: 50 = $n - c$

in the upper left-hand corner: 150 = $c$

in the lower right-hand corner: 150 = $c$

The estimate this time is clearly negative, so we set $x_0 = c/n = 150/200 = .75$.
Referring to Figure 3 we find, using the curve corresponding to $\lambda = .20$, that
the estimate of $\rho$ is $\rho^* = -.44$.
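
Where Figure 3 is not at hand, the same equations can be solved numerically. The sketch below is an illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, the corner probability is computed from the identity $P(y > 0 \mid x) = \Phi(\rho x/\sqrt{1 - \rho^2})$ for the standardized bivariate normal by Simpson's rule, and the equation is used in the form "observed ratio $= p_1/\lambda$" and solved by bisection on $\rho$.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal


def corner_probability(rho, k, steps=2000, span=10.0):
    """P(x > k, y > 0) for a standardized bivariate normal with correlation rho.

    Integrates N(x,0,1) * Phi(rho * x / sqrt(1 - rho^2)) over [k, k + span]
    by Simpson's rule (steps must be even)."""
    t = rho / math.sqrt(1.0 - rho * rho)
    h = span / steps
    total = 0.0
    for j in range(steps + 1):
        x = k + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * STD.pdf(x) * STD.cdf(t * x)
    return total * h / 3.0


def solve_rho(ratio, k, tol=1e-6):
    """Bisection for the rho at which p1 / lambda equals the observed ratio."""
    lam = 1.0 - STD.cdf(k)
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if corner_probability(mid, k) / lam < ratio:
            lo = mid  # p1/lambda increases with rho, so the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Paragraph 5A example: (n1 + n3) / (n1 + n2 + n3 + n4) = 330/550, k = .612.
rho_5a = solve_rho(330 / 550, 0.612)

# Paragraph 5B example: (n - c)/n = 50/200 with lambda* = n/N = 200/1000.
k_star = STD.inv_cdf(1.0 - 200 / 1000)
rho_5b = solve_rho(50 / 200, k_star)

print(rho_5a, rho_5b)
```

Solving with the ratio $(n - c)/n$ directly gives the negative root without the sign reversal of step iv); the two values should agree with the graphical readings .20 and $-.44$ to about the accuracy with which the figure can be read.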

5C. The use of averages for estimating $\rho$ when the variance ratio is known.
Nair and Shrivastava [12, 1942] have considered the use of means for estimating
regression coefficients when one observation is taken at each of $n$ equally spaced
fixed variates, $x_i$ $(i = 1, 2, \ldots, n)$, and $y$ is normally distributed. Their procedure
was essentially to consider the ordered fixed variates, and to discard a
group of observations in the interior, much as we discarded the set of observations
whose $x$ coordinates were $x_{n+1}, x_{n+2}, \ldots, x_{N-n}$ in paragraph 5B. The
resulting estimates depended essentially on the averages of the $y$'s on the right
and left sets of observations, and on the averages of the fixed $x$'s in the two
sets.

In an unpublished manuscript George Brown has considered a problem even
more closely related to the one considered in paragraph 5A. Suppose $x$ and $y$
normally distributed according to (20) with equal variances $\sigma^2$, and means
equal to zero. (The ratio of variances must be known; equality is unnecessary.)
Retain only those observations for which $|x_i| > k\sigma$, and from them form the
statistic

$$
\rho_B = \frac{\bar y_+ - \bar y_-}{\bar x_+ - \bar x_-},
\tag{50}
$$

where $\bar y_+$ and $\bar x_+$ are the averages of the $n_1$ $x$'s and $y$'s for which $x_i > k\sigma$ and
$\bar y_-$ and $\bar x_-$ are similarly defined for the $n_2$ observations for which $x_i < -k\sigma$.
Then $\rho_B$ is an unbiased estimate of $\rho$. Regarding the $x$'s as fixed variates it
turns out that

$$
\sigma^2(\rho_B) = \frac{(1 - \rho^2)\,\sigma^2}{(\bar x_+ - \bar x_-)^2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right).
\tag{51}
$$

If we approximate by substituting expected values for observed values, (51)
turns out to be $(1 - \rho^2)\lambda/2N[g(k)]^2$, where $\lambda = \int_{-\infty}^{-k} g(x)\,dx$, $g(x) = N(x, 0, 1)$.
The value of $k$ which minimizes this expression is our old friend $k = .6121$,
which gives $\lambda = .2702$. Therefore for $\rho = 0$ and large samples, the minimum
variance is approximately $1.23/N$, for an efficiency of about .81. The relative
efficiency of the methods of paragraphs 5A and 5B is .635 compared with the
present technique.
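
As an illustration of (50) and of the large-sample constant just quoted, the following Python sketch computes $\rho_B$ from paired data and evaluates $\lambda/2[g(k)]^2$. The function names and the toy data are hypothetical; the code assumes, as the text does, zero means and a common known $\sigma$.

```python
from statistics import NormalDist, fmean

STD = NormalDist()  # g(x) = N(x, 0, 1)


def brown_estimate(xs, ys, k, sigma=1.0):
    """rho_B of (50): ratio of the differences of tail averages of y and x."""
    upper = [(x, y) for x, y in zip(xs, ys) if x > k * sigma]
    lower = [(x, y) for x, y in zip(xs, ys) if x < -k * sigma]
    if not upper or not lower:
        raise ValueError("both tails must contain observations")
    x_plus = fmean(x for x, _ in upper)
    y_plus = fmean(y for _, y in upper)
    x_minus = fmean(x for x, _ in lower)
    y_minus = fmean(y for _, y in lower)
    return (y_plus - y_minus) / (x_plus - x_minus)


def brown_variance_constant(k):
    """N * sigma^2(rho_B) at rho = 0 for large samples: lambda / (2 g(k)^2)."""
    lam = STD.cdf(-k)
    return lam / (2.0 * STD.pdf(k) ** 2)


# Toy data in which y is exactly half of x, so the statistic returns one half.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0.5 * x for x in xs]
print(brown_estimate(xs, ys, k=0.6))

const = brown_variance_constant(0.6121)
print(const, 1.0 / const)  # variance constant and efficiency relative to r
```

Dividing this constant by the constant of (32) at the same $k$ gives $2/\pi$, about .64, which is the relative efficiency of the counting methods quoted above up to rounding.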

We presume that the analogous order statistics construction would produce
much the same result. Our interest in the present technique is to supply an
approximate answer to the question of what is to be gained by going from the
counting technique proposed in paragraph 5B to the next level of computational
difficulty: addition.

THE NON-CENTRAL WISHART DISTRIBUTION AND CERTAIN
PROBLEMS OF MULTIVARIATE STATISTICS¹

By T. W. Anderson 

Cowles Commission for Research in Economics 

1. Summary. The non-central Wishart distribution is the joint distribution
of the sums of squares and cross-products of the deviations from the sample
means when the observations arise from a set of normal multivariate populations
with constant covariance matrix but expected values that vary from observation
to observation. The characteristic function for this distribution is obtained
from the distribution of the observations (Theorem 1). By using the characteristic
functions it is shown that the convolution of several non-central
Wishart distributions is another non-central Wishart distribution (Theorem 2).
A simple integral representation of the distribution in the general case is given
(Theorem 3). The integrand is a function of the roots of a determinantal equation
involving the matrix of sums of squares and cross-products of deviations
of observations and the matrix of sums of squares and cross-products of deviations
of corresponding expected values.

The knowledge of the non-central Wishart distribution is applied to two general
problems of multivariate normal statistics. The moments of the generalized
variance, which is the determinant of sums of squares and cross-products
multiplied by a constant, are given for the cases of the expected values of the
variates lying on a line (Theorem 4) and lying on a plane (Theorem 5). The
likelihood ratio criterion for testing linear hypotheses can be expressed as the
ratio of two determinants or as a symmetric function of the roots of a determinantal
equation. In either case there is involved a matrix having a Wishart
distribution and another matrix independently distributed such that the sum
of these two matrices has a non-central Wishart distribution. When the null
hypothesis is not true the moments of this criterion are given in the non-central
planar case (Theorem 6).

2. Introduction. The well-known Wishart distribution is the distribution of
the sums of squares and cross-products of deviations from the sample
means of observations from a multivariate normal distribution. If the
expected values of the variates change from observation to observation
(with the covariance matrix constant), the distribution of sums of squares and
cross-products is the non-central Wishart distribution. This distribution has
been given explicitly [1] for the simple cases of the non-central problem. If we
think of the expected values of each observation as defining a point in a space of
dimensionality equal to the number of variates, we can say that the cases
handled are those in which the points corresponding to a sample lie on a line or
a plane. Although the explicit formulas for the distribution of higher rank are
extremely complicated and have not been derived, the characteristic function
is relatively simple. The distribution in general can be given in terms of a
simple multiple integral.

¹ Part of a thesis submitted to the Mathematics Department of Princeton University in
partial fulfillment of the requirements for the degree of Doctor of Philosophy, June, 1945.

The Wishart distribution is the basis of much of the sampling theory associated
with the multivariate normal distribution. It plays a role similar to that
of the $\chi^2$-distribution in univariate normal theory. It can be used in deriving
the distributions of the generalized $T^2$ and of the multiple correlation coefficient
when all variates have a normal distribution; it is used in deriving the moments
of the likelihood ratio criterion for testing the general linear hypothesis (including
the test of the means of several populations being equal) as well as deriving the
moments of other such criteria.² For the problems of the $T^2$ and the test of
the linear hypothesis and many other problems, the non-central Wishart distribution
must be substituted for the central Wishart distribution when the
null hypothesis is not true. That is, the non-central distribution can be the
basis of obtaining the power function for many tests in multivariate normal
statistics. As an example of the application of the non-central Wishart distribution
to these problems, in this paper we obtain the moments of the generalized
variance and the moments of the criterion for linear hypotheses when
the population means lie on a line or a plane. Applications to other problems
such as testing collinearity, comparing scales of measurement, and multiple
regression in time series analysis will be published in a later paper [3]. Another
problem to which this non-central theory can be applied is a method of estimating
the parameters of a single equation of a complete system of linear stochastic
difference equations (developed by T. W. Anderson, M. A. Girshick and H.
Rubin).

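In [1] it was shown that one can make linear transformations on the observations
which simplify the derivation of the non-central Wishart distribution in
the linear and planar cases. Consider a set of $N$ multivariate normal populations,
each of $p$ variates. Let the $i$-th $(i = 1, 2, \ldots, p)$ variate of the $\alpha$-th
$(\alpha = 1, 2, \ldots, N)$ population be $x_{i\alpha}$; let the mean of the variate be

$$
E(x_{i\alpha}) = \mu_{i\alpha} \qquad (i = 1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, N).
\tag{1}
$$

Let the covariance matrix (of rank $p$) common to all $N$ populations be

$$
\| E(x_{i\alpha} - \mu_{i\alpha})(x_{j\alpha} - \mu_{j\alpha}) \| = \| \sigma_{ij} \| \qquad (\alpha = 1, 2, \ldots, N).
$$

The probability element of the $x_{i\alpha}$ can be written as

$$
|\sigma^{ij}|^{\frac{1}{2}N}\,(2\pi)^{-\frac{1}{2}pN}
\exp\Big[-\tfrac{1}{2}\sum_{i,j,\alpha} \sigma^{ij}(x_{i\alpha} - \mu_{i\alpha})(x_{j\alpha} - \mu_{j\alpha})\Big]
\prod_{i,\alpha} dx_{i\alpha},
\tag{2}
$$

where $\| \sigma^{ij} \| = \| \sigma_{ij} \|^{-1}$.

² See Wilks [2] for example.

The sums of squares and cross-products of deviations from the means in a sample
$\{x_{i\alpha}\}$ are

$$
a_{ij} = \sum_{\alpha=1}^{N} (x_{i\alpha} - \bar x_i)(x_{j\alpha} - \bar x_j),
\tag{3}
$$

where

$$
\bar x_i = \frac{1}{N}\sum_{\alpha=1}^{N} x_{i\alpha}.
$$

The dimensionality, say $t$, of the space spanned by $\| \mu_{i\alpha} - \bar\mu_i \|$ is equal to the rank of

$$
\| \tau_{ij} \| = \Big\| \sum_{\alpha=1}^{N} (\mu_{i\alpha} - \bar\mu_i)(\mu_{j\alpha} - \bar\mu_j) \Big\|,
\tag{4}
$$

where

$$
\bar\mu_i = \frac{1}{N}\sum_{\alpha=1}^{N} \mu_{i\alpha}.
$$

As a result of a linear transformation it was demonstrated that the distribution
of $a_{ij}$ is the same as that of $\sum_{\alpha=1}^{N-1} x'_{i\alpha} x'_{j\alpha}$ where the $x'_{i\alpha}$ have a normal
multivariate distribution with covariance matrix $\| \sigma_{ij} \|$ and expected values

$$
E(x'_{i\alpha}) = \mu'_{i\alpha} \qquad (i = 1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, N - 1),
$$

such that

$$
\| \tau_{ij} \| = \Big\| \sum_{\alpha=1}^{N-1} \mu'_{i\alpha}\,\mu'_{j\alpha} \Big\|.
$$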

The joint distribution of a xi is given for three cases: 

(i) Case t = 0: 

(5) W(a„ , (Tij , r„ ; p,N- 1, 0) = K, j c 3 1 | o„ 1 1( ^ ,) exp [- * Z <r” a,,]; 

(ii) Case t = 1 : 

(6) 17(0,, , a,, ,r„;p,N — 1, 1) = iCx exp [-* Z Aw 1 I «r* I 10 " 4 * I o„ l^^ 


X exp [- hY,*' 3 a ( J Iu^CVEaw t«,); 

(iii) Case i = 2: 

W(a„ , (ft, , Ti, ; p, N - 1, 2) = K t exp [- § Z <r u r J | | 1<w-1) 


( 7 ) 


X ] a„ | 1(y 31 5> exp [ I Z a,,] Z 2 2 “«i!r(^[A^ - 2] + w) 
X (in + I 4(w _) +lw (V5T+1S),
where


if- 1 = ir Wr ~ l) n r(M - *1), 


P—1 


ifr 1 - 3) , WH) nrG(iV - i - i]), 


p— 2 


JvT 1 = r Mp ~ l) n r (W ~ 2 - *]), 


$I_\nu(x)$ is the Bessel function of purely imaginary argument, and $u_1$ and $u_2$ are
the two non-zero roots of

$$
| T - \lambda A^{-1} | = 0
\tag{8}
$$

(here $T = \| \tau_{ij} \|$ and $A = \| a_{ij} \|$). The number $N - 1$ is the number of
degrees of freedom and $t$ is the rank. The matrix $\| \sigma_{ij} \|$ we shall call the sigma
matrix, and $\| \tau_{ij} \|$ we shall call the means sigma matrix.

Let $\kappa_1^2, \kappa_2^2, \ldots, \kappa_p^2$ be the real, non-negative roots of the determinantal equation

$$
| T - \lambda\Sigma | = 0
\tag{9}
$$

(where $\Sigma = \| \sigma_{ij} \|$). There is a non-singular $p \times p$ matrix $\Psi$ $(= \| \psi_{ij} \|)$
such that

$$
\Psi\Sigma\Psi' = I
\tag{10}
$$

and

$$
\Psi T\Psi' = \| \kappa_i^2\,\delta_{ij} \|
\tag{11}
$$

(where $I$ is the identity, $\Psi'$ is the transpose of $\Psi$ and $\delta_{ij} = 1$ for $i = j$ and 0 for
$i \neq j$). Then the quantities

$$
b_{ij} = \sum_{h,k=1}^{p} \psi_{ih}\,\psi_{jk}\,a_{hk}
\tag{12}
$$

have the distribution $W(b_{ij}, \delta_{ij}, \kappa_i^2\delta_{ij};\ p, n, t)$ (where $n = N - 1$ and $\kappa_i^2 = 0$
for $i = t + 1, t + 2, \ldots, p$). This is the same distribution that would be
derived if the $b_{ij}$ were defined by

$$
b_{ij} = \sum_{\alpha=1}^{n} y_{i\alpha}\,y_{j\alpha},
\tag{13}
$$

where the distribution of the $y_{i\alpha}$ is

$$
(2\pi)^{-\frac{1}{2}pn}\exp\Big[-\tfrac{1}{2}\sum_{i=1}^{p}\sum_{\alpha=1}^{n} (y_{i\alpha} - \kappa_i\,\delta_{i\alpha})^2\Big].
\tag{14}
$$

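This simplified distribution of the observations has been called the canonical
form.

3. The characteristic function of the non-central Wishart distribution. We
shall find the characteristic function of the $a_{ii}$ and $2a_{ij}$ $(i \neq j)$ as defined in (3).
We first obtain the characteristic function of the $b_{ii}$ and $2b_{ij}$ $(i \neq j)$ as defined
in (13) and then perform a linear transformation to obtain the characteristic
function of the $a_{ij}$. The characteristic function of the $b_{ii}$ and $2b_{ij}$ $(i \neq j)$
is defined as

$$
E\Big[\exp\Big(i\sum_{i,j=1}^{p} \theta_{ij}\,b_{ij}\Big)\Big],
\tag{15}
$$

where

$$
\theta_{ij} = \theta_{ji}
$$

and $i$ in the exponent is the imaginary quantity $\sqrt{-1}$.
We can write (15) as

$$
E\Big[\exp\Big(i\sum_{i,j=1}^{p}\sum_{\alpha=1}^{n} \theta_{ij}\,y_{i\alpha}y_{j\alpha}\Big)\Big]
= (2\pi)^{-\frac{1}{2}pn}\int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty}
\exp\Big[-\tfrac{1}{2}\sum_{i=1}^{p}\sum_{\alpha=1}^{n} (y_{i\alpha} - \kappa_i\delta_{i\alpha})^2
+ i\sum_{i,j=1}^{p}\sum_{\alpha=1}^{n} \theta_{ij}\,y_{i\alpha}y_{j\alpha}\Big]
\prod_{i=1}^{p}\prod_{\alpha=1}^{n} dy_{i\alpha}.
$$

Let us first integrate the $y_{i\alpha}$ for $i = 1, 2, \ldots, p$ and $\alpha = t + 1, t + 2, \ldots, n$,
that is, make the integration

$$
(2\pi)^{-\frac{1}{2}p(n-t)}\int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty}
\exp\Big[-\tfrac{1}{2}\sum_{i=1}^{p}\sum_{\alpha=t+1}^{n} y_{i\alpha}^2
+ i\sum_{i,j=1}^{p}\sum_{\alpha=t+1}^{n} \theta_{ij}\,y_{i\alpha}y_{j\alpha}\Big]
\prod_{i=1}^{p}\prod_{\alpha=t+1}^{n} dy_{i\alpha}.
$$

This is, however, the characteristic function of a Wishart distribution with
$n - t$ degrees of freedom [4], namely

$$
| \delta_{ij} - 2i\theta_{ij} |^{-\frac{1}{2}(n-t)}.
\tag{16}
$$

Now we must make the integration

$$
(2\pi)^{-\frac{1}{2}pt}\int_{-\infty}^{\infty}\!\!\cdots\!\int_{-\infty}^{\infty}
\exp\Big[-\tfrac{1}{2}\sum_{i=1}^{p}\sum_{\alpha=1}^{t} (y_{i\alpha} - \kappa_i\delta_{i\alpha})^2
+ i\sum_{i,j=1}^{p}\sum_{\alpha=1}^{t} \theta_{ij}\,y_{i\alpha}y_{j\alpha}\Big]
\prod_{i=1}^{p}\prod_{\alpha=1}^{t} dy_{i\alpha}.
\tag{17}
$$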
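There is a $p \times p$ matrix $G = \| g_{ij} \|$ such that

$$
\sum_{k,h=1}^{p} g_{ki}\,d_{kh}\,g_{hj} = \delta_{ij},
$$

where

$$
d_{kh} = \delta_{kh} - 2i\theta_{kh}.
$$

Let us make the transformation

$$
y_{i\alpha} = \sum_{h=1}^{p} g_{ih}\,z_{h\alpha} + d^{i\alpha}\kappa_\alpha,
$$

where

$$
\| d^{ij} \| = \| d_{ij} \|^{-1}.
$$

Then the exponent of (17) within the integral sign is

$$
-\tfrac{1}{2}\sum_{i=1}^{p}\sum_{\alpha=1}^{t} z_{i\alpha}^2
- \tfrac{1}{2}\Big(\sum_{i=1}^{t} \kappa_i^2 - \sum_{i=1}^{t} d^{ii}\kappa_i^2\Big)
$$

and the Jacobian of the transformation is

$$
| d_{ij} |^{-\frac{1}{2}t}.
$$

Hence, the integral of (17) is

$$
| d_{ij} |^{-\frac{1}{2}t}\exp\Big[-\tfrac{1}{2}\Big(\sum_{i=1}^{t} \kappa_i^2 - \sum_{i=1}^{t} d^{ii}\kappa_i^2\Big)\Big].
\tag{18}
$$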

This result is obviously true if the 0^ are pure imaginary and sufficiently small 
so that || d tj || = || 8,j — 2idi, || (which is real in this case) is positive definite. 
For all complex 0,-j in a neighborhood of the origin (17) converges because the 
real part of |[ d t] || is positive definite. Similarly the integral of the derivative 
with respect to 0,, of the integrand converges for 0,,- in this neighborhood. It 
follows that the (complex) derivative of the characteristic function exists in this 
neighborhood because the derivative of the integrand is measurable and is abso- 
lutely integrable- Therefore, the characteristic function is analytic in a neigh- 
borhood of the origin. From this it follows that the characteristic function is 
analytic in an open set containing the flat space of real 6 i} . By analytic con- 
tinuation, then, (18) is the value of (17) in the open set containing real 0,-, . The 
characteristic function (15) is the product of (16) and (18). Accordingly, we 
have the result that the characteristic function of the 6„and 26^ ( i^j ) defined 
by (13) is 

(19) | d i! | l ” exp [-g (g ~ £ <*"*;)] • 

It is clear that if $\kappa_i = 0$ (for all $i$), this function reduces to the characteristic
function of the Wishart distribution with $n$ degrees of freedom, namely,

(20)    $|\delta_{ij} - 2i\theta_{ij}|^{-\frac{1}{2}n}$.

It is interesting to note that (19) factors into two parts, one of which is (20)
and the other is

(21)    $\exp\left[-\tfrac{1}{2}\left(\sum_{i=1}^{t} \kappa_i^2 - \sum_{i=1}^{t} d^{ii}\kappa_i^2\right)\right]$.

The distribution function similarly factors into two parts, one of which is the
Wishart distribution, whose characteristic function is (20). Thus the
non-central Wishart distribution function is the convolution of a function (central
Wishart distribution) and another (the transform of (21)), the first of which is
a factor of this same non-central Wishart distribution.

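In the planar case the characteristic function can be written as

$\frac{\exp\left[-\frac{1}{2}\left(\kappa_1^2 - d^{11}\kappa_1^2\right)\right]}{|\delta_{ij} - 2i\theta_{ij}|^{\frac{1}{2}n_1}} \cdot \frac{\exp\left[-\frac{1}{2}\left(\kappa_2^2 - d^{22}\kappa_2^2\right)\right]}{|\delta_{ij} - 2i\theta_{ij}|^{\frac{1}{2}n_2}}$,

where $n_1 + n_2 = n$. From this fact it is clear that the distribution for the
planar case (if $n > 2p + 2$) is a convolution of two distributions each of the
linear case.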

This deduction can also be made from the distribution (14). Let

$b'_{ij} = y_{i1}y_{j1} + \sum_{\alpha=3}^{n_1+1} y_{i\alpha}y_{j\alpha}$,

$b''_{ij} = y_{i2}y_{j2} + \sum_{\alpha=n_1+2}^{n} y_{i\alpha}y_{j\alpha}$.

Then it is clear that the $b'_{ij}$ has the non-central Wishart distribution with $n_1$
degrees of freedom and parameter $\kappa_1$ in the direction of the first coordinate
axis, while the $b''_{ij}$ has the non-central Wishart distribution with $n_2$ degrees of
freedom and parameter $\kappa_2$ in the direction of the second coordinate axis. Since

$b_{ij} = b'_{ij} + b''_{ij}$,

the distribution of the $b_{ij}$ is a convolution of the distributions of $b'_{ij}$ and $b''_{ij}$.
In general the non-central distribution is the convolution of $t$ distributions of
the linear case (provided $n > tp + t$).

It is easy to show that if one has two (or more) non-central Wishart
distributions of rank 1 with parameters in the same direction, the convolution is
again a non-central Wishart distribution with parameter in the same direction.
Suppose $b'_{ij}$ and $b''_{ij}$ have non-central Wishart distributions with parameters $\kappa'_1$
and $\kappa''_1$ in the direction of the first coordinate axis and $n_1$ and $n_2$ degrees of
freedom respectively. The characteristic functions are

$|d_{ij}|^{-\frac{1}{2}n_1} \exp\left[-\tfrac{1}{2}\left(\kappa_1'^2 - d^{11}\kappa_1'^2\right)\right]$

and

$|d_{ij}|^{-\frac{1}{2}n_2} \exp\left[-\tfrac{1}{2}\left(\kappa_1''^2 - d^{11}\kappa_1''^2\right)\right]$.

The product is

$|d_{ij}|^{-\frac{1}{2}n} \exp\left[-\tfrac{1}{2}\left(\kappa_1^2 - d^{11}\kappa_1^2\right)\right]$,

where $n = n_1 + n_2$ and $\kappa_1^2 = \kappa_1'^2 + \kappa_1''^2$.

Now let us deduce the characteristic function of the $a_{ii}$ and $2a_{ij}$ $(i \neq j)$.
Since by (12) the $b$'s are transforms of the $a$'s we can write
$a_{ij} = \sum_{h,k=1}^{p} \psi^{ih}\psi^{jk} b_{hk}$.
Then

(22)    $E\left(\exp\left[i \sum_{i,j=1}^{p} \phi_{ij} a_{ij}\right]\right) = E\left(\exp\left[i \sum_{i,j,h,k=1}^{p} \phi_{ij}\psi^{ih}\psi^{jk} b_{hk}\right]\right)$,

where $\phi_{ij} = \phi_{ji}$. If we define

(23)    $\theta_{hk} = \sum_{i,j=1}^{p} \phi_{ij}\psi^{ih}\psi^{jk}$,

then (22) can be derived by substituting (23) in (19).

Let

$\Phi = \|\phi_{ij}\|$.

Then

$\|d_{ij}\| = D = \Psi'^{-1}(\Sigma^{-1} - 2i\Phi)\Psi^{-1}$

and

$D^{-1} = \Psi(\Sigma^{-1} - 2i\Phi)^{-1}\Psi'$.

The characteristic function of the o’s then can be written as 

exp [~%{tr(VTV) - ir[*(2 _1 - 2zVt>)~ V*? 1 *']}] 
|| *'~' 1 1 - IS" 1 - 2i$| • I^-'I} 1 " 
using (10) and (11) The denominator is 

||*' | |*| }-*"| 2“‘ - 2*I»| in 

and the numerator can be written as 8 
exp [- J(tr (MWM) — tr [M'*'*(2 _1 — 2i*) -: V*M] ) ] 
where 

M = II M>« - Hi || 

and 

M’M = T. 


We may summarize in the following theorem: 

Theorem 1. Given $a_{ij}$ $(i, j = 1, 2, \ldots, p)$ defined by (3) where the $x_{i\alpha}$ $(i =
1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, N)$ are distributed according to (2), the characteristic
function of $a_{ii}$ and $2a_{ij}$ $(i \neq j)$ is



1 The result follows from the fact that tr(AB) = tr(BA). 
where

$\|\sigma^{ij}\| = \|\sigma_{ij}\|^{-1}$

and

$\phi_{ij} = \phi_{ji}$.

Suppose we have two sets of quantities $a'_{ij}$ and $a''_{ij}$, each set of which is
distributed according to a non-central Wishart distribution with sigma matrix
$\|\sigma_{ij}\|$, one having $n'$ degrees of freedom (or $n''$), means sigma matrix $\tau'_{ij}$ (or
$\tau''_{ij}$) of rank $t'$ (or $t''$). Consideration of the characteristic functions (24) shows
that

$a_{ij} = a'_{ij} + a''_{ij}$

has a non-central Wishart distribution with matrix $\|\sigma_{ij}\|$, $n' + n''$ degrees of
freedom and a matrix

$\|\tau_{ij}\| = \|\tau'_{ij}\| + \|\tau''_{ij}\|$.

The rank of the distribution is equal to the rank of $\|\tau_{ij}\|$. This result can
also be deduced from the representation of $a'_{ij}$ and $a''_{ij}$ in terms of observations
from non-central normal populations. It is a straightforward generalization
of the same result for central Wishart distributions.

Theorem 2. The convolution of two or more non-central Wishart distributions
with identical sigma matrices is a non-central Wishart distribution with means
sigma matrix equal to the sum of the means sigma matrices of the components.
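
As a small numerical illustration of Theorem 2, the sketch below checks the one-variate case ($p = 1$, unit variance), where the non-central Wishart distribution is the non-central $\chi^2$ distribution and (19) becomes $(1 - 2i\theta)^{-\frac{1}{2}n}\exp[-\frac{1}{2}\kappa^2(1 - (1 - 2i\theta)^{-1})]$. The function name `ncx2_cf` and the numerical values are illustrative assumptions, not part of the paper.

```python
import cmath


def ncx2_cf(theta, n, kappa_sq):
    """Characteristic function (19) for p = 1: d = 1 - 2i*theta, d^{11} = 1/d."""
    d = 1 - 2j * theta
    return d ** (-n / 2) * cmath.exp(-0.5 * kappa_sq * (1 - 1 / d))


# The product of two characteristic functions (a convolution of the
# distributions) should equal the characteristic function with the degrees of
# freedom added and the squared parameters added.
theta = 0.3
product = ncx2_cf(theta, 4, 1.5) * ncx2_cf(theta, 6, 2.0)
combined = ncx2_cf(theta, 10, 3.5)
print(abs(product - combined))
```

The printed difference should be at the level of floating-point rounding, which mirrors the statement that degrees of freedom and squared parameters add under convolution.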

4. An integral representation of the non-central Wishart distribution in the 
general case. It was shown in [1] that 

W(b t] , 8a , k] S., ; p, n, t) = Ce~ itrB J \ B - YT e ,r(JC ' y) dY 

where 

c _i = p[ r( |[ n - t + 1 - »D, 

dB = n n db x , , 

t-1 1-t 

dY =nil4„ 

l—l 

$B = \|b_{ij}\|$,

$Y = \|y_{ij}\|$,

$K = \|\kappa_i\delta_{ij}\|$    $(i = 1, 2, \ldots, p;\ j = 1, 2, \ldots, t)$,

and the integration is on $Y$ over the range $\|B - YY'\|$ positive semi-definite.
This is equivalent to

(25)


Ce~ i,rB j B J | / - Y'B- 1 Y |K»-i>-'‘-u e ' r,r ' y) &Y. 


The integration is over the range of $Y$ for which $\|I - Y'B^{-1}Y\|$ is positive
semi-definite.

There is a $p$ by $p$ matrix $H = \|h_{ij}\|$ such that

H'B-'H = I 


H'K = W = ||u>A,||, 

where w] are the roots of 

| k% } - W 1 1 = 0 , 

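Then make the transformation to $Z = \|z_{ij}\|$ by

$Y = HZ$.


The Jacobian of the transformation is


$|H|^{t} = |B|^{\frac{1}{2}t}$.

Then (25) can be written as

(26)    $C e^{-\frac{1}{2}\mathrm{tr}\,B}\,|B|^{\frac{1}{2}(n-p-1)} \int |I - Z'Z|^{\frac{1}{2}(n-p-t-1)}\, e^{\mathrm{tr}(W'Z)}\, dZ$.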


Partition

$Z = \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}$

such that $Z_1$ is square $(t \times t)$. Let $I - Z_1'Z_1 = E'E$, where
$E$ is specified uniquely (in terms of $Z_1$), and consider the transformation of
variables from $Z_2$ to $V$ defined by

$Z_2 = VE$.

Then (26) can be written as


Ce 


,—ilrB | 


B 


iKn-P-U 


dB 


[v-K 


Zn 




dZi 


■ f \I~ V'V\ iln - p ~ t ~ 1) dV i 

where

$W_1 = \|w_i\delta_{ij}\|$    $(i, j = 1, 2, \ldots, t)$.

The first integration is over the range $(I - Z_1'Z_1)$ positive semi-definite and the
second is over $(I - V'V)$ positive semi-definite. The value of the second
integral is
/


I — V'V l*'"-*-'- 1 ’ dV 


Q r( | [n _ 2 t + 1 - * 1 ) 


S r(i[» - < + i - * 1 ) 


Hence (26) can be written as 
(27) C 1 e-' i ‘ rS \B\ iu '- p ~ 1) J 1 7 


2' ^ |iC"~ s, ~U fViM* 


dZi 


with 

Cl 1 

(28) 


itr(K'K) nipn lp(P-l)+i< ! 

e & v 

x n r(i[n - 1 + 1 - *D n r(i[ft - t + l - fl). 

1—1 1—1 


The first part of (27) is, except for a constant factor, a central Wishart
distribution with $n$ degrees of freedom. The integral of the second part is obviously
a symmetric function of the $w_i$. In terms of the $a_{ij}$ the $w_i^2$ are simply the
roots of (8). We can sum these results in a theorem.

Theorem 3. Given a sample of observations $(x_{i\alpha})$ $(i = 1, 2, \ldots, p;\ \alpha = 1, 2,
\ldots, N)$ distributed according to (2), the probability density function of the sums
of squares and cross products of deviations from the sample means defined by (3) is


M | ^ |»CJW— P~2> 


W(a xi] <r„ , T.y ; p, N — 1, t) — Cl | o 

■exp a’ a„ J J j S, ( - £ Z.,z.{ 


HN-u~n 


t t 

■ exp^to,2,i XI ds,( 

i—i ii{— t 


integrated over 

i 

Sij£ 2 
1—1 

positive semi-definite, where $C_1$ $(n = N - 1)$ and $\tau_{ij}$ are defined by (28) and (4),
respectively, and where $w_i^2$ are the $t$ non-zero roots of (8).

5. The moments of the generalized variance in the linear and planar
non-central cases.

5.1. The linear case. The generalized variance, which is the determinant of
the variances and covariances,$^4$ is a measure of the spread of the observations.
If one thinks of the $N$ observations of each variate as a vector in $N$-space with
origin at the sample mean, the generalized variance is proportional to the square
of the volume of the $p$-dimensional parallelotope which is defined by these
vectors as principal edges. Another geometric interpretation can be given in
terms of the $p$-dimensional variate space. The generalized variance is
proportional to the sum of the squared volumes of all possible parallelotopes that
can be formed by choosing as the $p$ principal edges $p$ of the $N$ sample vectors
(origin at the sample mean).

4 This definition of Wilks [5] was made in terms of variances and covariances defined by
$a_{ij}/N$ (from equation (3)). Since we consider $a_{ij}/(N-1)$ to be the variances and covariances
we define $|a_{ij}/(N-1)|$ as the generalized variance.

In this section we consider the moments of the generalized variance when the
distributions of the observations are non-central multivariate normal. In
terms of the first geometric representation this means that the center of one or
more of the vector distributions is different from the others. For convenience
we shall assume that the distribution of the observations $(y_{i\alpha})$ is according to
(14). This will give as much generality as if we treated observations $(x_{i\alpha})$
having the distribution (2). Moreover, we shall consider the determinant of
sums of squares and cross-products instead of the determinant of variances and
covariances. It is clear that the determinant $|b_{ij}|$, defined by (13), is
simply a multiple (by $|\Sigma|^{-1}(N-1)^{p}$) of $|a_{ij}/(N-1)|$, with $a_{ij}$ defined by (3).

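Let us first consider the linear case, i.e., $\kappa = \kappa_1 \neq 0$ and $\kappa_i = 0$ $(i = 2, \ldots, p)$
in (14). The first of the $p$ vectors is centered on the first coordinate axis, not
at the origin. Then the probability density function of the $b_{ij}$ is

(29)    $\frac{e^{-\frac{1}{2}\kappa^2}\,|b_{ij}|^{\frac{1}{2}(n-p-1)}\,\exp\left[-\frac{1}{2}\sum_{i=1}^{p} b_{ii}\right]}{2^{\frac{1}{2}pn}\,\pi^{\frac{1}{4}p(p-1)}\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])} \sum_{\alpha=0}^{\infty} \frac{(\kappa^2 b_{11})^{\alpha}}{2^{2\alpha}\,\alpha!\,\Gamma(\frac{1}{2}n+\alpha)}$.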


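We wish to find the moments $E(|b_{ij}|^{h})$. Let

$b_{ij} = s_i s_j r_{ij}$.

Then $s_i^2$ is the sum of squares of the $i$-th variate and $\|r_{ij}\|$ is the matrix of
sample correlation coefficients. The Jacobian of this transformation (to $s_i^2$,
$r_{ij}$) is

$(s_1^2)^{\frac{1}{2}(p-1)}(s_2^2)^{\frac{1}{2}(p-1)} \cdots (s_p^2)^{\frac{1}{2}(p-1)}$.

The probability element of the $s_i^2$'s and $r_{ij}$'s is

(30)    $\frac{\exp\left[-\frac{1}{2}\kappa^2 - \frac{1}{2}\sum_{i=1}^{p} s_i^2\right]\prod_{i=1}^{p}(s_i^2)^{\frac{1}{2}n-1}\,|r_{ij}|^{\frac{1}{2}(n-p-1)}}{2^{\frac{1}{2}pn}\,\pi^{\frac{1}{4}p(p-1)}\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])} \sum_{\alpha=0}^{\infty} \frac{(\kappa^2 s_1^2)^{\alpha}}{2^{2\alpha}\,\alpha!\,\Gamma(\frac{1}{2}n+\alpha)} \prod_{i=1}^{p} d(s_i^2) \prod_{i=1}^{p-1}\prod_{j=i+1}^{p} dr_{ij}$.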
n dti) n n 
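It is clear from (30) that the $s_i^2$ are distributed independently and that the set
$r_{ij}$ have a joint distribution independent of the $s_i^2$'s. Hence

$E(|b_{ij}|^{h}) = E(|s_i s_j r_{ij}|^{h}) = \prod_{i=1}^{p} E[(s_i^2)^{h}] \cdot E(|r_{ij}|^{h})$.

The probability element of $s_i^2$ $(i = 2, 3, \ldots, p)$ is

$\frac{(s_i^2)^{\frac{1}{2}n-1} e^{-\frac{1}{2}s_i^2}}{2^{\frac{1}{2}n}\,\Gamma(\frac{1}{2}n)}\, d(s_i^2)$,

which is simply the $\chi^2$-distribution. The $h$-th moment of $s_i^2$ $(i = 2, 3, \ldots, p)$ is

$E[(s_i^2)^{h}] = \frac{2^{h}\,\Gamma(\frac{1}{2}n + h)}{\Gamma(\frac{1}{2}n)}$.

The probability element of $s_1^2$ is

(31)    $\frac{e^{-\frac{1}{2}\kappa^2}(s_1^2)^{\frac{1}{2}n-1} e^{-\frac{1}{2}s_1^2}}{2^{\frac{1}{2}n}} \sum_{\alpha=0}^{\infty} \frac{(\kappa^2 s_1^2)^{\alpha}}{2^{2\alpha}\,\alpha!\,\Gamma(\frac{1}{2}n+\alpha)}\, d(s_1^2)$.

This is the $\chi'^2$-distribution (non-central $\chi^2$-distribution) which was given by
Fisher [6]. Applying term-by-term integration (the series converges properly)
we get the $h$-th moment

$E[(s_1^2)^{h}] = 2^{h} e^{-\frac{1}{2}\kappa^2} \sum_{\alpha=0}^{\infty} \frac{\kappa^{2\alpha}\,\Gamma(\frac{1}{2}n + h + \alpha)}{2^{\alpha}\,\alpha!\,\Gamma(\frac{1}{2}n + \alpha)}$.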

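The probability element of the $r_{ij}$ is the well known distribution of correlation
coefficients,

$\frac{\Gamma^{p-1}(\frac{1}{2}n)\,|r_{ij}|^{\frac{1}{2}(n-p-1)}}{\pi^{\frac{1}{4}p(p-1)}\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])} \prod_{i=1}^{p-1}\prod_{j=i+1}^{p} dr_{ij}$.

Since

$\int \cdots \int |r_{ij}|^{\frac{1}{2}(n-p-1)} \prod_{i=1}^{p-1}\prod_{j=i+1}^{p} dr_{ij} = \frac{\pi^{\frac{1}{4}p(p-1)}\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])}{\Gamma^{p-1}(\frac{1}{2}n)}$,

where the integration is over the entire (permissible) range of the $r_{ij}$, we have
as a consequence the $h$-th moment of the determinant (since $n$ is arbitrary)

$E(|r_{ij}|^{h}) = \frac{\Gamma^{p-1}(\frac{1}{2}n)}{\pi^{\frac{1}{4}p(p-1)}\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])} \int \cdots \int |r_{ij}|^{\frac{1}{2}(n+2h-p-1)} \prod_{i=1}^{p-1}\prod_{j=i+1}^{p} dr_{ij} = \frac{\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i]+h)\;\Gamma^{p-1}(\frac{1}{2}n)}{\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])\;\Gamma^{p-1}(\frac{1}{2}n+h)}$.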
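Hence, the $h$-th moment of $|s_i s_j r_{ij}|$ is

(32)    $2^{ph}\, e^{-\frac{1}{2}\kappa^2}\, \frac{\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n+2h-i])}{\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[n-i])} \sum_{\alpha=0}^{\infty} \frac{\kappa^{2\alpha}\,\Gamma(\frac{1}{2}n + h + \alpha)}{2^{\alpha}\,\alpha!\,\Gamma(\frac{1}{2}n + \alpha)}$.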
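Let us summarize this in a theorem for the $a_{ij}$.

Theorem 4. If the quantities $a_{ij}$ $(i, j = 1, 2, \ldots, p)$ have the distribution
$W(a_{ij};\ \sigma_{ij},\ \tau_{ij};\ p,\ N-1,\ 1)$ defined by (6), then the moments of $|a_{ij}|$ are given by

$E(|a_{ij}|^{h}) = |\sigma_{ij}|^{h}\, 2^{ph}\, e^{-\frac{1}{2}\kappa^2}\, \frac{\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[N-1-i]+h)}{\prod_{i=1}^{p-1}\Gamma(\frac{1}{2}[N-1-i])} \sum_{\alpha=0}^{\infty} \frac{\kappa^{2\alpha}\,\Gamma(\frac{1}{2}[N-1] + h + \alpha)}{2^{\alpha}\,\alpha!\,\Gamma(\frac{1}{2}[N-1] + \alpha)}$,

where $\kappa^2$ is the non-zero root of (9).

The $h$-th moment of the generalized variance $|a_{ij}/(N-1)|$ is obtained by
dividing the above expression by $(N-1)^{ph}$.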

If $\kappa = 0$, expression (32) clearly reduces to the moment given by Wilks [5]

(33)    $2^{ph}\, \frac{\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i]+h)}{\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i])}$.

The expression (32) gives the moments of the generalized variance when the
means of the observations are not fixed, but lie on a line. The distribution of
$|b_{ij}|$ is not a simple function even in the central case. However, in any
particular case one could find the first few moments of $|b_{ij}|$ and fit a distribution
function. It is to be noted that the convergence of the series is nearly as rapid
as that for $e^{\frac{1}{2}\kappa^2}$.
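
The computation suggested above can be sketched numerically. The following is a minimal illustration, assuming the reading of (32) as $2^{ph} e^{-\frac{1}{2}\kappa^2} \prod_{i=1}^{p-1} \Gamma(\frac{1}{2}[n-i]+h)/\Gamma(\frac{1}{2}[n-i])$ times the series in $\alpha$, and of (33) as its value at $\kappa = 0$. The function names and the truncation length `terms` are assumptions made for the example.

```python
from math import exp, lgamma, log


def linear_case_moment(h, n, p, kappa_sq, terms=200):
    """h-th moment of |b_ij| in the linear case, series (32), truncated."""
    log_prefactor = p * h * log(2) - 0.5 * kappa_sq
    for i in range(1, p):
        log_prefactor += lgamma(0.5 * (n - i) + h) - lgamma(0.5 * (n - i))
    total = 0.0
    for a in range(terms):
        if kappa_sq == 0 and a > 0:
            break  # only the alpha = 0 term survives in the central case
        log_term = lgamma(0.5 * n + h + a) - lgamma(0.5 * n + a) - lgamma(a + 1)
        if a > 0:
            log_term += a * log(kappa_sq / 2)
        total += exp(log_term)
    return exp(log_prefactor) * total


def central_moment(h, n, p):
    """h-th moment of |b_ij| in the central case, expression (33)."""
    log_value = p * h * log(2)
    for i in range(1, p + 1):
        log_value += lgamma(0.5 * (n + 1 - i) + h) - lgamma(0.5 * (n + 1 - i))
    return exp(log_value)


# First moment for p = 1 is the mean of a non-central chi-square: n + kappa^2.
print(linear_case_moment(1, 8, 1, 3.0))
# With kappa = 0 the series collapses to the central value (33).
print(linear_case_moment(2, 10, 3, 0.0), central_moment(2, 10, 3))
```

Working with logarithms of gamma functions avoids overflow for large $n$; the terms decay roughly like those of the exponential series, in line with the remark on convergence.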

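5.2. The planar case. Next we shall treat the planar case for two dimensions.
Suppose that $\kappa_i \neq 0$ $(i = 1, 2)$. The probability density function of $b_{11}$, $b_{22}$,
and $b_{12}$ is

(34)    $\frac{\exp\left[-\frac{1}{2}(\kappa_1^2+\kappa_2^2) - \frac{1}{2}(b_{11}+b_{22})\right](b_{11}b_{22}-b_{12}^2)^{\frac{1}{2}(n-3)}}{2^{n}\sqrt{\pi}} \sum_{\alpha,\beta=0}^{\infty} \frac{[\kappa_1^2\kappa_2^2(b_{11}b_{22}-b_{12}^2)]^{\alpha}(\kappa_1^2 b_{11}+\kappa_2^2 b_{22})^{\beta}}{2^{4\alpha+2\beta}\,\alpha!\,\beta!\,\Gamma(\frac{1}{2}[n-1]+\alpha)\,\Gamma(\frac{1}{2}n+2\alpha+\beta)}$.

Let $b_{11} = s_1^2$, $b_{22} = s_2^2$, and $b_{12} = s_1 s_2 r$. The Jacobian is $s_1 s_2$. The probability
element of $s_1^2$, $s_2^2$ and $r$ is

(35)    $\frac{e^{-\frac{1}{2}(\kappa_1^2+\kappa_2^2)}\, e^{-\frac{1}{2}(s_1^2+s_2^2)}(s_1^2 s_2^2)^{\frac{1}{2}n-1}(1-r^2)^{\frac{1}{2}(n-3)}}{2^{n}\sqrt{\pi}} \sum_{\alpha,\beta=0}^{\infty} \frac{(\kappa_1^2\kappa_2^2 s_1^2 s_2^2)^{\alpha}(1-r^2)^{\alpha}(\kappa_1^2 s_1^2+\kappa_2^2 s_2^2)^{\beta}\, d(s_1^2)\, d(s_2^2)\, dr}{2^{4\alpha+2\beta}\,\alpha!\,\beta!\,\Gamma(\frac{1}{2}[n-1]+\alpha)\,\Gamma(\frac{1}{2}n+2\alpha+\beta)}$.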
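We wish to find $E\{[s_1^2 s_2^2(1-r^2)]^{h}\}$. Let us first multiply (35) by $(1-r^2)^{h}$
and integrate from $-1$ to $+1$. We then obtain

$\frac{e^{-\frac{1}{2}(\kappa_1^2+\kappa_2^2)}\, e^{-\frac{1}{2}(s_1^2+s_2^2)}(s_1^2 s_2^2)^{\frac{1}{2}n-1}}{2^{n}} \sum_{\alpha,\beta=0}^{\infty} \frac{(\kappa_1^2\kappa_2^2 s_1^2 s_2^2)^{\alpha}(\kappa_1^2 s_1^2+\kappa_2^2 s_2^2)^{\beta}\,\Gamma(\frac{1}{2}[n-1]+h+\alpha)\, d(s_1^2)\, d(s_2^2)}{2^{4\alpha+2\beta}\,\alpha!\,\beta!\,\Gamma(\frac{1}{2}[n-1]+\alpha)\,\Gamma(\frac{1}{2}n+2\alpha+\beta)\,\Gamma(\frac{1}{2}n+h+\alpha)}$.

Next we multiply by $(s_1^2)^{h}(s_2^2)^{h}$, set $(\kappa_1^2 s_1^2+\kappa_2^2 s_2^2)^{\beta}/\beta!$ equal to

$\sum_{\beta_1+\beta_2=\beta} \frac{(\kappa_1^2 s_1^2)^{\beta_1}(\kappa_2^2 s_2^2)^{\beta_2}}{\beta_1!\,\beta_2!}$

and integrate $s_1^2$ and $s_2^2$ from $0$ to $\infty$. We obtain

(36)    $E([b_{11}b_{22}-b_{12}^2]^{h}) = 2^{2h}\exp\left[-\tfrac{1}{2}(\kappa_1^2+\kappa_2^2)\right] \sum_{\alpha,\beta_1,\beta_2=0}^{\infty} \frac{(\kappa_1^2)^{\alpha+\beta_1}(\kappa_2^2)^{\alpha+\beta_2}\,\Gamma(\frac{1}{2}n+h+\alpha+\beta_1)\,\Gamma(\frac{1}{2}n+h+\alpha+\beta_2)}{2^{2\alpha+\beta_1+\beta_2}\,\alpha!\,\beta_1!\,\beta_2!\,\Gamma(\frac{1}{2}[n-1]+\alpha)\,\Gamma(\frac{1}{2}n+2\alpha+\beta_1+\beta_2)} \times \frac{\Gamma(\frac{1}{2}[n-1]+h+\alpha)}{\Gamma(\frac{1}{2}n+h+\alpha)}$,

which is the expected value we are seeking.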

Clearly this reduces to a special case of (32) if $\kappa_2$ is set equal to zero.
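
The triple series can be checked numerically. The sketch below assumes the reading of (36) with terms $(\kappa_1^2)^{\alpha+\beta_1}(\kappa_2^2)^{\alpha+\beta_2}/2^{2\alpha+\beta_1+\beta_2}$ and the gamma-function ratios shown there; the function names and the truncation length are illustrative assumptions. For $h = 1$ the result is compared with the value $(n-1)(n+\kappa_1^2+\kappa_2^2)+\kappa_1^2\kappa_2^2$, which is obtained here directly from the second moments of the normal variates and is not a formula stated in the paper.

```python
from math import exp, lgamma, log


def _log_pow(x, k):
    """log(x**k), treating 0**0 as 1 and 0**k (k > 0) as 0."""
    if k == 0:
        return 0.0
    return k * log(x) if x > 0 else float("-inf")


def planar_moment_2x2(h, n, k1_sq, k2_sq, terms=40):
    """h-th moment of b11*b22 - b12**2 from the truncated triple series (36)."""
    half_n, half_n1 = 0.5 * n, 0.5 * (n - 1)
    total = 0.0
    for a in range(terms):
        for b1 in range(terms):
            for b2 in range(terms):
                log_term = (
                    _log_pow(k1_sq / 2, a + b1) + _log_pow(k2_sq / 2, a + b2)
                    - lgamma(a + 1) - lgamma(b1 + 1) - lgamma(b2 + 1)
                    + lgamma(half_n + h + a + b1) + lgamma(half_n + h + a + b2)
                    + lgamma(half_n1 + h + a) - lgamma(half_n1 + a)
                    - lgamma(half_n + 2 * a + b1 + b2) - lgamma(half_n + h + a)
                )
                total += exp(log_term)
    return 2 ** (2 * h) * exp(-0.5 * (k1_sq + k2_sq)) * total


n, k1_sq, k2_sq = 7, 1.5, 0.8
print(planar_moment_2x2(1, n, k1_sq, k2_sq))
print((n - 1) * (n + k1_sq + k2_sq) + k1_sq * k2_sq)
```

Setting `k2_sq` to zero leaves only the terms with $\alpha = \beta_2 = 0$, which is the reduction to the linear case noted in the text.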

Now we consider the planar case in $p$ dimensions. Geometrically we have
$p$ vectors in $n$-space. If the $\{y_{i\alpha}\}$ are distributed according to (14) the mean
point (i.e., center of distribution) of the first two vectors is different from the
origin, but the mean point of each of the other $p - 2$ vectors is the origin.
The vectors are distributed independently. The determinant

$|b_{ij}| = \left|\sum_{\alpha=1}^{n} y_{i\alpha}y_{j\alpha}\right|$

is the square of the volume of the parallelepiped, which can be expressed as

$v_1 v_2 \cdots v_p \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{p-1}$,

where $v_i$ is the length of the $i$-th vector and $\theta_i$ is the angle between the $(i+1)$-st
vector and the flat space determined by the first $i$ vectors. The distribution of
$v_3, \ldots, v_p$ and $\theta_2, \ldots, \theta_{p-1}$ is statistically independent of $v_1$, $v_2$, and $\theta_1$; for
no matter what the plane of the first two vectors is, the conditional distribution
of the other variables is the same. Hence

$E(|b_{ij}|^{h}) = E[(v_1 v_2 \sin\theta_1)^{2h}] \cdot E[(v_3 v_4 \cdots v_p \sin\theta_2 \cdots \sin\theta_{p-1})^{2h}]$.

If the $y$'s had simply the distribution

(37)    $\frac{1}{(2\pi)^{\frac{1}{2}pn}} \exp\left[-\tfrac{1}{2}\sum_{i=1}^{p}\sum_{\alpha=1}^{n} y_{i\alpha}^2\right]$,

then the $h$-th moment of $|b_{ij}|$ would be (33), and the $h$-th moment of

$v_1^2 v_2^2 \sin^2\theta_1 = \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}$

would be

$2^{2h}\, \frac{\prod_{i=1}^{2}\Gamma(\frac{1}{2}[n+1-i]+h)}{\prod_{i=1}^{2}\Gamma(\frac{1}{2}[n+1-i])}$.

Since the distribution of $v_3, \ldots, v_p$ and $\theta_2, \ldots, \theta_{p-1}$ is the same whether the
$y$'s are distributed according to (14) or (37), we have

(38)    $E[(v_3 \cdots v_p \sin\theta_2 \cdots \sin\theta_{p-1})^{2h}] = 2^{h(p-2)}\, \frac{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[n+2h+1-i])}{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[n+1-i])}$.

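Multiplying (36) by (38) we obtain the $h$-th moment of $|b_{ij}|$, namely,

$E(|b_{ij}|^{h}) = 2^{ph}\exp\left[-\tfrac{1}{2}(\kappa_1^2+\kappa_2^2)\right] \frac{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[n+2h+1-i])}{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[n+1-i])} \sum_{\alpha,\beta_1,\beta_2=0}^{\infty} \frac{(\kappa_1^2)^{\alpha+\beta_1}(\kappa_2^2)^{\alpha+\beta_2}}{2^{2\alpha+\beta_1+\beta_2}\,\alpha!\,\beta_1!\,\beta_2!\,\Gamma(\frac{1}{2}[n-1]+\alpha)} \cdot \frac{\Gamma(\frac{1}{2}n+h+\alpha+\beta_1)\,\Gamma(\frac{1}{2}n+h+\alpha+\beta_2)\,\Gamma(\frac{1}{2}[n-1]+h+\alpha)}{\Gamma(\frac{1}{2}n+2\alpha+\beta_1+\beta_2)\,\Gamma(\frac{1}{2}n+h+\alpha)}$.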

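This result may be summarized as follows:

Theorem 5. Let the probability density function of the quantities $a_{ij}$ $(i, j =
1, 2, \ldots, p)$ be

$W(a_{ij};\ \sigma_{ij},\ \tau_{ij};\ p,\ N-1,\ 2)$

defined by (7). Then the $h$-th moment of $|a_{ij}|$ is

(39)    $E(|a_{ij}|^{h}) = |\sigma_{ij}|^{h}\, 2^{ph}\exp\left[-\tfrac{1}{2}(\kappa_1^2+\kappa_2^2)\right] \frac{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[N-i]+h)}{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[N-i])} \sum_{\alpha,\beta_1,\beta_2=0}^{\infty} \frac{(\kappa_1^2)^{\alpha+\beta_1}(\kappa_2^2)^{\alpha+\beta_2}\,\Gamma(\frac{1}{2}[N-1]+h+\alpha+\beta_1)\,\Gamma(\frac{1}{2}[N-1]+h+\alpha+\beta_2)}{2^{2\alpha+\beta_1+\beta_2}\,\alpha!\,\beta_1!\,\beta_2!\,\Gamma(\frac{1}{2}[N-2]+\alpha)\,\Gamma(\frac{1}{2}[N-1]+2\alpha+\beta_1+\beta_2)} \times \frac{\Gamma(\frac{1}{2}[N-2]+h+\alpha)}{\Gamma(\frac{1}{2}[N-1]+h+\alpha)}$,

with $\kappa_i^2$ defined by (9).

The $h$-th moment of the generalized variance $|a_{ij}/(N-1)|$ is obtained by
dividing the above expression by $(N-1)^{ph}$. This formula holds for all $h >
-\frac{1}{2}(N-p)$.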


6. The moments of the criterion for testing linear hypothesis in the linear and 
planar non-central cases. 

6.1. The moments of the criterion. There are several linear hypotheses con- 
cerning the means of multivariate normal populations that can be included in 
a general formulation of the problem. We shall first of all consider a simple 
case of a linear hypothesis and find the moments of the criterion under linear and 
planar alternatives. In Section 6.2 we shall indicate some linear hypotheses 
that can be reduced to this simple case. Regression problems and the problem 
of equality of means in several populations (studied by Wilks) are included. 

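Suppose the variates $z_{i\alpha}$ $(i = 1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, n)$ and $y_{i\gamma}$ $(i = 1, 2,
\ldots, p;\ \gamma = 1, 2, \ldots, q)$ have the probability element

(40)    $\frac{|\sigma^{ij}|^{\frac{1}{2}(n+q)}}{(2\pi)^{\frac{1}{2}p(n+q)}} \exp\left[-\tfrac{1}{2}\sum_{i,j=1}^{p}\sum_{\alpha=1}^{n} \sigma^{ij} z_{i\alpha}z_{j\alpha}\right] \exp\left[-\tfrac{1}{2}\sum_{i,j=1}^{p}\sum_{\gamma=1}^{q} \sigma^{ij}(y_{i\gamma}-\mu_{i\gamma})(y_{j\gamma}-\mu_{j\gamma})\right] \prod_{i=1}^{p}\prod_{\alpha=1}^{n} dz_{i\alpha} \prod_{i=1}^{p}\prod_{\gamma=1}^{q} dy_{i\gamma}$.

Let us consider the hypothesis $H_0$ that the means of $mp$ $y$'s are zero, namely,

$H_0:\ \mu_{i\gamma} = 0$    $(i = 1, 2, \ldots, p;\ \gamma = 1, 2, \ldots, m)$.

Let

(41)    $a_{ij} = \sum_{\gamma=1}^{m} y_{i\gamma}y_{j\gamma}$,

(42)    $b_{ij} = \sum_{\alpha=1}^{n} z_{i\alpha}z_{j\alpha}$,

(43)    $c_{ij} = a_{ij} + b_{ij}$.

Then the likelihood ratio criterion for testing $H_0$, called by Hsu [8] the
Wilks-Lawley hypothesis, is the $\frac{1}{2}(n+q)$ power of

(44)    $W = \frac{|b_{ij}|}{|c_{ij}|}$.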


(44) 
Under the null hypothesis the $b_{ij}$ have a Wishart distribution with $n$ degrees
of freedom, and the $a_{ij}$ are distributed independently of $b_{ij}$ such that $c_{ij}$ has a
Wishart distribution with $n + m$ degrees of freedom. Wilks [6] has given the
moments of $W$ and in some special cases the distribution of $W$.

We shall now obtain the moments of $W$ for distributions specified by (40)
where the rank of $\|\mu_{i\gamma}\|$ $(\gamma = 1, \ldots, m)$ is 2, i.e., the planar case. Under
this assumption the $b_{ij}$ have a Wishart distribution with $n$ degrees of freedom,
the $a_{ij}$ are independently distributed in such a way that the $c_{ij}$ have a
non-central Wishart distribution with $n + m$ degrees of freedom. Let $\kappa_1^2$ and $\kappa_2^2$
be the non-zero roots of

(45)    $\left|\sum_{\gamma=1}^{m} \mu_{i\gamma}\mu_{j\gamma} - \lambda\sigma_{ij}\right| = 0$.


It is clear that the distribution of $W$ is unchanged if $\sigma^{ij}$ is set equal to $\delta_{ij}$.
Furthermore, we can take $\mu_{i\gamma} = \kappa_i\delta_{i\gamma}$; then the $c_{ij}$ are distributed according to
$W(c_{ij};\ \delta_{ij},\ \kappa_i^2\delta_{ij};\ p,\ n+m,\ 2)$ with $n + m$ degrees of freedom. The moments
will be obtained by a method similar to that used by Wilks [5].

Let the expected value given by (39) be

(46)    $E(|c_{ij}|^{h}) = K(n+m,\ h,\ p,\ \kappa_i^2)$,

which is a constant depending on $n + m$, $h$, $p$, $\kappa_1^2$, and $\kappa_2^2$. If $D(a_{ij})$ represents
the distribution function of the $a_{ij}$, one can write (46) as
(47) K(n + m, h, p, k s <) = 

2 ),,n T lp ( p - 1) n r (i [ n + 1 -il) 


[ I Ci, T 1 b„- exp [- * £ &„] Dia { d ] ft ft db u dA 

J y-i 


where $dA$ is the volume element of the $a_{ij}$, and where the integration is over
the entire (permissible) ranges of the $b_{ij}$ and $a_{ij}$. Equation (47) holds since
the $c$'s are functions of the $b$'s and $a$'s. Multiplying (47) by

(48)    $2^{\frac{1}{2}pn}\,\pi^{\frac{1}{4}p(p-1)} \prod_{i=1}^{p} \Gamma(\tfrac{1}{2}[n+1-i])$,

then replacing $n$ by $n + 2g$ and dividing by (48) again, we obtain

2 Mn-W]J r( £ [n + i __ + g) 

(49) K(n + m + 2 g, h, p, k?) ^ 

^Urdln + i-.!) 

1 

2 i r n 7r4 p(p~ l )]J r (^[ rt+ j _ fl) 

<-l 

■flc, N b„ | i(n+I ‘-*- 1 > exp [ - b tl ] Dia„) ft ft db.idA] 1 . 

J L *-i i-i i-j _ . J 
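By definition the right hand side of (49) is the expected value of $|c_{ij}|^{h}|b_{ij}|^{g}$.
Hence

$E(|c_{ij}|^{h}|b_{ij}|^{g}) = K(n+m+2g,\ h,\ p,\ \kappa_i^2)\; 2^{pg}\, \frac{\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i]+g)}{\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i])}$.

In this expression it is permissible to set $h$ equal to $-g$ ($n$ could have been
replaced by $n + 2g$ in (47) to insure the argument of each $\Gamma$ function being
positive). Then we have

$E(W^{g}) = E(|c_{ij}|^{-g}|b_{ij}|^{g}) = K(n+m+2g,\ -g,\ p,\ \kappa_i^2)\; 2^{pg}\, \frac{\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i]+g)}{\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i])}$.

Finally, the $g$-th moment is

(50)    $E(W^{g}) = \exp\left[-\tfrac{1}{2}(\kappa_1^2+\kappa_2^2)\right] \frac{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[n+m+1-i])\,\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i]+g)}{\prod_{i=3}^{p}\Gamma(\frac{1}{2}[n+m+1-i]+g)\,\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i])} \sum_{\alpha,\beta_1,\beta_2=0}^{\infty} \left[\frac{(\kappa_1^2)^{\alpha+\beta_1}(\kappa_2^2)^{\alpha+\beta_2}\,\Gamma(\frac{1}{2}[n+m]+\alpha+\beta_1)\,\Gamma(\frac{1}{2}[n+m]+\alpha+\beta_2)}{2^{2\alpha+\beta_1+\beta_2}\,\alpha!\,\beta_1!\,\beta_2!\,\Gamma(\frac{1}{2}[n+m-1]+g+\alpha)\,\Gamma(\frac{1}{2}[n+m]+\alpha)} \cdot \frac{\Gamma(\frac{1}{2}[n+m-1]+\alpha)}{\Gamma(\frac{1}{2}[n+m]+g+2\alpha+\beta_1+\beta_2)}\right]$.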

We can summarize in the following theorem:

Theorem 6. Let $z_{i\alpha}$ $(i = 1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, n)$ and $y_{i\gamma}$ $(i = 1, 2,
\ldots, p;\ \gamma = 1, 2, \ldots, q)$ have (40) as a joint distribution. Define $a_{ij}$, $b_{ij}$, and $c_{ij}$
by (41), (42), and (43), respectively. Let $\kappa_1^2$ and $\kappa_2^2$ be the non-zero roots of (45).
Then the $g$-th moment of $W$, defined by (44), is (50). Expression (50) gives the
moments of $W$ in the planar case. The linear case is a special case of the planar
case, that is, it is the planar case for $\kappa_2 = 0$. The $g$-th moment of $W$ in the
linear case is given by

(51)    $E(W^{g}) = \exp\left[-\tfrac{1}{2}\kappa_1^2\right] \frac{\prod_{i=2}^{p}\Gamma(\frac{1}{2}[n+m+1-i])\,\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i]+g)}{\prod_{i=2}^{p}\Gamma(\frac{1}{2}[n+m+1-i]+g)\,\prod_{i=1}^{p}\Gamma(\frac{1}{2}[n+1-i])} \sum_{\beta_1=0}^{\infty} \frac{(\kappa_1^2)^{\beta_1}\,\Gamma(\frac{1}{2}[n+m]+\beta_1)}{2^{\beta_1}\,\beta_1!\,\Gamma(\frac{1}{2}[n+m]+g+\beta_1)}$.

For $\kappa_1 = 0$, (51) reduces to the expression given for the moments under the
null hypothesis.
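
A minimal numerical sketch of the linear-case moments follows. It assumes the reading of (51) as a ratio of gamma-function products (with the $i = 1$ factor of the $n + m$ products absorbed into the series) multiplied by a single series in $\beta_1$; the function name and truncation length are illustrative assumptions. For $p = 1$ the criterion is a ratio $b/(a+b)$ of chi-square sums, so under the null hypothesis its first moment should be $n/(n+m)$.

```python
from math import exp, lgamma, log


def w_moment_linear(g, n, m, p, kappa_sq, terms=200):
    """g-th moment of W in the linear case, series (51), truncated."""
    log_prefactor = -0.5 * kappa_sq
    for i in range(2, p + 1):
        log_prefactor += lgamma(0.5 * (n + m + 1 - i))
        log_prefactor -= lgamma(0.5 * (n + m + 1 - i) + g)
    for i in range(1, p + 1):
        log_prefactor += lgamma(0.5 * (n + 1 - i) + g)
        log_prefactor -= lgamma(0.5 * (n + 1 - i))
    half = 0.5 * (n + m)
    total = 0.0
    for beta in range(terms):
        if kappa_sq == 0 and beta > 0:
            break  # only the first term survives under the null hypothesis
        log_term = lgamma(half + beta) - lgamma(half + g + beta) - lgamma(beta + 1)
        if beta > 0:
            log_term += beta * log(kappa_sq / 2)
        total += exp(log_term)
    return exp(log_prefactor) * total


print(w_moment_linear(1, 10, 3, 1, 0.0))   # null case, p = 1: n / (n + m)
print(w_moment_linear(1, 10, 3, 1, 4.0))   # smaller under a linear alternative
```

Evaluating the function for several values of `kappa_sq` gives the moment values to which a distribution function could be fitted.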
Wilks [7] has given the distribution of $W$ under the null hypothesis for
several special cases (i.e., certain pairs of $n$ and $p$). In general, however, the
distribution function is too complicated to write down explicitly. When the
null hypothesis is not satisfied (i.e., at least one $\kappa_i \neq 0$) the distribution functions
are yet more involved. Hence, we shall not write any explicitly.

Hsu [8] has given the asymptotic distribution of $W$. Suppose that

$T_n = \sum_{i,j=1}^{p}\sum_{\gamma=1}^{m} \sigma^{ij}\mu_{i\gamma}\mu_{j\gamma}$

tends to the limit $T_0$ as $n$ tends to infinity (if the $\mu$'s are functions of $n$). Then
the limiting distribution of $x = -(n+q)\log W$ (which equals $-2\log\Lambda$, where
$\Lambda$ is the likelihood ratio criterion) is

(52)    $2^{-\frac{1}{2}pm}\, e^{-\frac{1}{2}T_0}\, x^{\frac{1}{2}pm-1}\, e^{-\frac{1}{2}x} \sum_{\alpha=0}^{\infty} \frac{(T_0 x)^{\alpha}}{2^{2\alpha}\,\alpha!\,\Gamma(\frac{1}{2}pm+\alpha)}$.

That is, it is the $\chi'^2$ distribution with $pm$ degrees of freedom and parameter $T_0$.

For most purposes, alternative hypotheses of the means being on a line
(i.e., of rank one) are sufficiently general. In any particular case, one can
compute from (51) numerical values for several moments and then fit an
appropriate distribution function. If one wishes to consider alternative hypotheses
of rank two, one can use (50) and similarly compute numerical values for
moments. The series in either (51) or (50) converge rapidly. To construct an
approximate power function for linear alternatives, say, one would fit
distribution functions for several values of $\kappa_1$ and find the desired percentage levels.

There is a matrix $\|d_{ij}\|$ such that

$\|b_{ij}\| = \|d_{ij}\| \cdot \|d_{ij}\|'$

and

$\|a_{ij}\| = \|d_{ij}\| \cdot \|\lambda_i\delta_{ij}\| \cdot \|d_{ij}\|'$,

where the $\lambda$'s are roots of

(53)    $|a_{ij} - \lambda b_{ij}| = 0$.

It follows that

$\|c_{ij}\| = \|d_{ij}\| \cdot \|(1+\lambda_i)\delta_{ij}\| \cdot \|d_{ij}\|'$.

Then $W$ can be written as

(54)    $W = \frac{|d_{ij}| \cdot |d_{ij}|'}{|d_{ij}| \cdot |(1+\lambda_i)\delta_{ij}| \cdot |d_{ij}|'} = \frac{1}{\prod_{i=1}^{p}(1+\lambda_i)}$.

The distribution of the roots of (53) in the linear case has been given by Roy
[9] for any dimensionality $p$.$^5$ The distribution in the planar case has been
indicated by Anderson [3]. One could obtain the probability of $W$ not
exceeding a given value by integrating the $\lambda$'s over the proper range.
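
The identity (54) can be illustrated in the smallest non-trivial case. The sketch below takes two symmetric $2 \times 2$ matrices chosen arbitrarily for the example (they are not data from the paper), solves the quadratic $|a_{ij} - \lambda b_{ij}| = 0$ directly, and compares $|b_{ij}|/|a_{ij}+b_{ij}|$ with $1/\prod(1+\lambda_i)$. All names are illustrative.

```python
from math import sqrt


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_w(a, b):
    """W = |b| / |a + b| for 2 x 2 matrices."""
    c = [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]
    return det2(b) / det2(c)


def roots_of_53(a, b):
    """Roots of |a - lam*b| = 0, i.e. |b| lam^2 - mixed*lam + |a| = 0."""
    mixed = (a[0][0] * b[1][1] + a[1][1] * b[0][0]
             - a[0][1] * b[1][0] - a[1][0] * b[0][1])
    qa, qb, qc = det2(b), -mixed, det2(a)
    disc = sqrt(qb * qb - 4 * qa * qc)
    return (-qb - disc) / (2 * qa), (-qb + disc) / (2 * qa)


a = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical "hypothesis" matrix
b = [[4.0, 1.0], [1.0, 5.0]]   # hypothetical positive definite "error" matrix
lam1, lam2 = roots_of_53(a, b)
print(wilks_w(a, b), 1.0 / ((1.0 + lam1) * (1.0 + lam2)))
```

Both printed values should agree, since $\prod(1+\lambda_i) = |a_{ij}+b_{ij}|/|b_{ij}|$.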

6.2. Examples of the general linear hypothesis. A number of hypotheses
concerning the expected values of variates with multivariate normal distributions
can be put into the form of $H_0$. The equivalence of the hypotheses is
demonstrated by means of linear transformations.

As an example consider the hypothesis $H_1$ that the means of several normal
multivariate populations are equal when the respective covariance matrices
are equal. Let $x_{i\alpha}^{u}$ be the $\alpha$-th $(\alpha = 1, 2, \ldots, N_u)$ observation of the $i$-th $(i =
1, 2, \ldots, p)$ variate in the $u$-th $(u = 1, 2, \ldots, U)$ population. Let

(55)    $E(x_{i\alpha}^{u}) = \mu_i^{u}$    $(i = 1, 2, \ldots, p;\ u = 1, 2, \ldots, U)$,

and let the covariance matrix be $\|\sigma_{ij}\|$. Then the hypothesis is

(56)    $H_1:\ \mu_i^{u} = \mu_i$.

For testing this hypothesis let

(57)    $b_{ij} = \sum_{u=1}^{U}\sum_{\alpha=1}^{N_u} (x_{i\alpha}^{u} - \bar{x}_i^{u})(x_{j\alpha}^{u} - \bar{x}_j^{u})$,

(58)    $a_{ij} = \sum_{u=1}^{U} N_u(\bar{x}_i^{u} - \bar{x}_i)(\bar{x}_j^{u} - \bar{x}_j)$,

where

(59)    $\bar{x}_i^{u} = \frac{1}{N_u}\sum_{\alpha=1}^{N_u} x_{i\alpha}^{u}$,    $\bar{x}_i = \frac{1}{N}\sum_{u=1}^{U}\sum_{\alpha=1}^{N_u} x_{i\alpha}^{u}$    $(i = 1, 2, \ldots, p;\ u = 1, 2, \ldots, U)$.

The $b_{ij}$ have $n = N - U$ degrees of freedom and $c_{ij} = a_{ij} + b_{ij}$ have $N - 1$
degrees of freedom. Then the $N/2$ root of the likelihood ratio criterion for $H_1$
is $W$ defined by (44). For this case equation (45) is

$\left|\sum_{u=1}^{U} N_u(\mu_i^{u} - \bar{\mu}_i)(\mu_j^{u} - \bar{\mu}_j) - \lambda\sigma_{ij}\right| = 0$,

where

$\bar{\mu}_i = \frac{1}{N}\sum_{u=1}^{U} N_u\mu_i^{u}$.

5 Roy erroneously claims his distribution to hold for the planar case and higher rank.




T. W. ANDERSON 


Hsu has demonstrated that the general regression problem can be put into
the form of $H_0$. Suppose that $x_{i\alpha}$ $(i = 1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, N)$ follow
a multivariate normal distribution with covariance matrix $\|\sigma_{ij}\|$, and let the
expected value of $x_{i\alpha}$ be

$E(x_{i\alpha}) = \sum_{r=1}^{q} \beta_{ir}w_{r\alpha}$    $(q < N - p)$,

where the $q$ by $N$ matrix

$W = \|w_{r\alpha}\|$

is of rank $q$. Let $H_2$ be the hypothesis that

$H_2:\ B_1 = \|\beta_{iu}\| = 0$    $(i = 1, 2, \ldots, p;\ u = 1, 2, \ldots, m \leq q)$

with the $w$'s known. Let

$W_1 = \|w_{u\alpha}\|$    $(u = 1, 2, \ldots, m;\ \alpha = 1, 2, \ldots, N)$,

$W_2 = \|w_{r\alpha}\|$    $(r = m+1, \ldots, q;\ \alpha = 1, 2, \ldots, N)$,

$X = \|x_{i\alpha}\|$    $(i = 1, 2, \ldots, p;\ \alpha = 1, 2, \ldots, N)$.

Let

$\|b_{ij}\| = XX' - XW'(WW')^{-1}WX'$,

$\|c_{ij}\| = XX' - XW_2'(W_2W_2')^{-1}W_2X'$

(with $\|c_{ij}\| = XX'$ if $W_2 = 0$). Then the likelihood ratio criterion for $H_2$
is the $N/2$-th power of $W$, defined by (44).

The equation (45) can be written in terms of $\Sigma$, $B_1$, and $W$ as

(60)    $|B_1W_1(I - W_2'(W_2W_2')^{-1}W_2)W_1'B_1' - \lambda\Sigma| = 0$

for $m < q$. If $m = q$, (45) becomes

(61)    $|B_1WW'B_1' - \lambda\Sigma| = 0$.

In (60) and (61) there are no more non-zero roots than the rank of $B_1$. It is
clear that the roots of (60) (or (61)) depend on the matrix $W$ as well as $B_1$. The
distribution of $\Lambda$, the likelihood ratio criterion, under the null hypothesis does not
depend on the distribution of the matrix $W$ (if $W$ is not constant). However, the
distribution when the null hypothesis is not satisfied does depend on $\kappa_1$, or on $\kappa_1$
and $\kappa_2$, and hence on the distribution of the elements of $W$ as well as the value
of $B_1$.

The special case of $H_0$ for $m = q = 1$ gives the likelihood ratio criterion
as a function of Hotelling's generalized $T^2$. From the moments indicated in (50)
we can deduce the distribution of $T^2$ when the null hypothesis is not true [3].
This result has been obtained by Hsu [10] by another method.
ON HOTELLING’S WEIGHING PROBLEM 1

By Alexander M. Mood 
Iowa State College 


1. Summary. The paper contains some solutions of the weighing problems 
proposed by Hotelling [1]. The experimental designs are applicable to a broad 
class of problems of measurement of similar objects. The chemical balance 
problem (in which objects may be placed in either of the two pans of the bal- 
ance) is almost completely solved by means of designs constructed from Hadamard 
matrices. Designs are provided both for a balance which has a bias and for 
one which has no bias. 

The spring balance problem (in which objects may be placed in only one pan)
is completely solved when the balance is biased. For an unbiased spring
balance, designs are given for small numbers of objects and weighing operations.
Also the most efficient designs are found for the unbiased spring balance, but
it is shown that in some cases these cannot be used unless the number of
weighings is as large as the binomial coefficient $\binom{p}{\frac{1}{2}p}$ or $\binom{p}{\frac{1}{2}(p+1)}$,
where $p$ is the number of objects.

It is found that when $p$ objects are weighed in $N \geq p$ weighings, the variances
of the estimates of the weights are of the order of $\sigma^2/N$ in the chemical balance
case ($\sigma^2$ is the variance of a single weighing), and of the order of $4\sigma^2/N$ in the
spring balance case.


2. Introduction. The problem is fully discussed by Hotelling [1] and refers
to the design of a certain class of simple experiments. We may consider the 
typical example of the class to be that of weighing several small objects on a 
chemical balance or other weighing device. Hotelling and Yates [2] have shown 
that the individual weights may be determined more accurately by weighing 
the objects in combinations rather than weighing each one separately. The 
designs are applicable to a great variety of problems of measurement, not only 
of weights, but of lengths, voltages and resistances, concentrations of chemicals 
in solutions, in fact any measurements such that the measure of a combination 
is a known linear function of the separate measures with numerically equal 
coefficients. The designs should be particularly useful in biological and chemical 
laboratories engaged in routine chemical analyses. We shall, however, in the 
interest of simplicity, discuss the problem in the language of weighing operations. 

A particular design is denoted by a matrix. Thus three objects to be weighed
in four weighing operations may be weighed by the following design:

1 Journal Paper No. J-1405 of the Iowa Agricultural Experiment Station, Ames, Iowa.
Project No. 890.
1 1 0
1 0 1
0 1 1
1 1 1

where the rows refer to weighing operations and the columns refer to the objects.
In the above design the first two objects are weighed together in the first
weighing operation, the first and third objects are weighed together in the second
weighing operation, etc. From the four resulting weights the individual weights
are estimated by the method of least squares. The design problem consists of
finding matrices which will minimize the variances of these estimates.

There are two distinct though closely related problems here. One is to find
efficient designs for the case in which the measure of a combination can only
be the sum of the individual measures. This would be the case, for example,
in weighing objects with a spring balance and we shall refer to it as the spring
balance problem. The other problem is to find designs when an individual
measurement may be either added or subtracted in a combination. This would
be the case in weighing objects with a chemical balance (since an object may
be put in either pan of the balance) and will be called the chemical balance
problem. In the latter problem the design matrix may contain 0's, 1's, and $-1$'s,
whereas in the spring balance problem the matrix may contain only 0's and 1's.

We shall use Hotelling’s notation. There are p objects with weights
$b_1, b_2, \dots, b_p$ to be weighed in $N \ge p$ weighing operations. The design
matrix is denoted by

(1) $X = \|x_{\alpha i}\|$, $\alpha = 1, \dots, N$; $i = 1, \dots, p$.

Denoting the transpose of X by X', let

(2) $X'X = \|a_{ij}\|$, $\|a_{ij}\|^{-1} = \|a^{ij}\|$,

(3) $g_i = \sum_{\alpha} x_{\alpha i}\, y_{\alpha}$,

where $y_\alpha$ is the observed result of the $\alpha$-th weighing operation. The least squares
estimates of the $b_i$ are

(4) $\hat{b}_i = \sum_{j} a^{ij} g_j$,

and the variances of these estimates are

(5) $\sigma^2_{\hat{b}_i} = a^{ii}\sigma^2$,

where $\sigma^2$ is the error variance of a single weighing operation. The $a^{ii}$ will be
called variance factors.

Hotelling’s main theorem states that for any design, $a^{ii} \ge 1/N$; hence the
best possible design is one such that the inverse of the product of the design matrix
by its transpose has its main diagonal elements equal to 1/N. We shall call
such a design an optimum design. Examples show that optimum designs do
not exist for all values of N and p.

When an optimum design does not exist, the question arises as to how a
best design shall be defined. In the present paper a design will be called best
if the determinant of the matrix $\|a^{ij}\|$ is minimized. A best design in this
sense is, therefore, a design which gives the smallest confidence region in the
$b_i$ (i = 1, 2, ..., p) space for the estimates of the weights.

In certain situations, other definitions of best designs may conceivably be
preferred. Thus, problems may arise in which one might prefer:

(a) to minimize the variance factors subject to the restriction that they be
equal, (b) to minimize some function of the variance factors, or (c) to minimize
only a certain subset of the $a^{ii}$ or a minor of the matrix $\|a^{ij}\|$, as might be the
case when one wanted only rough estimates of the weights of some of the objects,
but accurate estimates of the others.

When an optimum design exists, the confidence regions are not only minimized, 
but, as Hotelling has shown, the variance factors are also minimized. It is not 
true in general, however, that a best design as here defined (minimum confidence 
regions) will also minimize the variance factors. Examples illustrating this 
point are given in the last part of section 6 and the first part of section 7. 

3. Hadamard Matrices. The problem of finding the best designs is closely
related to the Hadamard determinant problem. Hadamard [3] proved the following
result: If the elements $x_{\alpha\beta}$ of a square matrix X of order N are restricted to the
range $-1 \le x_{\alpha\beta} \le 1$, the maximum possible value of the determinant of X is $N^{N/2}$,
and when this maximum is achieved all $x_{\alpha\beta} = \pm 1$ and the matrix is orthogonal
in the sense that X'X is a diagonal matrix; the non-zero elements of X'X are
all equal to N. A matrix X which satisfies these conditions will be denoted by
$H_N$. Obviously if $H_N$ exists for a given N, it is the solution of the design
problem in the chemical balance case when N = p.

With regard to the existence of $H_N$, it is known that a necessary condition is

$N \equiv 0 \pmod 4$

with the exception of N = 2. It is not known however whether the above
condition is sufficient, although it is known (Paley [4]) that $H_{4k}$ exists for the
range

$0 < 4k \le 100$

with the possible exception of 4k = 92. Paley and Williamson [5] give methods
of constructing $H_{4k}$ in the given range (excepting 92) based on the theory of
finite fields.

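When N is a power of two, $H_N$ is easily constructed by taking direct products of

$$H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

Thus

$$H_4 = H_2 \cdot H_2 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.$$

Sylvester [6] first studied this class of matrices and Kishen [7] has described
weighing designs based on this subset of the $H_N$.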

The following examples of Hadamard matrices may be found in the literature:
Paley [4] exhibits an $H_8$, $H_{12}$, and $H_{28}$; Kishen gives an $H_{16}$. From these
examples $H_{24}$ and $H_{32}$ may be constructed at once from the direct products
$H_2 \cdot H_{12}$ and $H_2 \cdot H_{16}$. The following is an $H_{20}$:


[20 × 20 array of + and − signs; not legible in this copy]

where the signs represent ±1. This example was constructed by Williamson’s
method [5]. Thus examples of $H_{4k}$ for the range $4 \le 4k \le 32$ are immediately
available and methods of construction exist for the range $36 \le 4k \le 88$.


4. Chemical Balance Problem. When $N \equiv 0 \pmod 4$ an optimum design
exists if $H_N$ exists and is obtained by using any p columns of $H_N$. When
$N \not\equiv 0 \pmod 4$ we may construct very efficient designs as follows: If $N \equiv 1$
we may add a row of ones to $H_{N-1}$; if $N \equiv 2$ we may add two rows of ones or
a row of $H_2$’s to $H_{N-2}$; and if $N \equiv 3$ we may delete one row from $H_{N+1}$. The
worst of these designs will be obtained when two rows of ones are added to an
$H_{N-2}$, and in this case the variance factors are

$$a^{ii} = \frac{1}{N-2}\cdot\frac{N+2p-4}{N+2p-2} < \frac{1}{N-2}.$$

Since it is known that these factors must be greater than 1/N for the best
possible design in this case, the above design will be quite near the best design
for large N.
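
The variance factor quoted for the case of two added rows of ones can be checked numerically. The sketch below is only an illustration: it uses Sylvester's doubling construction to obtain $H_{N-2}$ (so N − 2 must be a power of two), takes the first p columns, and compares the diagonal of $(X'X)^{-1}$ with the closed form; all function names are our own.

```python
from fractions import Fraction

def sylvester(order):
    """Hadamard matrix of a power-of-two order, built by Sylvester doubling."""
    H = [[1]]
    while len(H) < order:
        H = [r + r for r in H] + [r + [-v for v in r] for r in H]
    return H

def gram(X):
    """Return X'X as a matrix of Fractions."""
    p = len(X[0])
    return [[Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
            for i in range(p)]

def inverse(A):
    """Invert a non-singular square matrix by exact Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def design(N, p):
    """First p columns of H_{N-2}, with two rows of ones appended."""
    return [row[:p] for row in sylvester(N - 2)] + [[1] * p, [1] * p]

def variance_factors(X):
    """Diagonal elements of (X'X)^{-1}."""
    G_inv = inverse(gram(X))
    return [G_inv[i][i] for i in range(len(G_inv))]

def closed_form(N, p):
    """Variance factor stated in the text for H_{N-2} plus two rows of ones."""
    return Fraction(1, N - 2) * Fraction(N + 2 * p - 4, N + 2 * p - 2)

cases = [(6, 3), (10, 5), (18, 7)]
computed = {(N, p): variance_factors(design(N, p)) for (N, p) in cases}
```

For N = 10 and p = 5, for example, both the direct computation and the closed form give 1/9, which lies between the lower bound 1/N = 1/10 and 1/(N − 2) = 1/8.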

For small values of N we shall consider only the case N = p, since if one
wanted to make N > p weighings, he would normally choose N to be a multiple
of four because the gain in efficiency by using optimum designs is rather large
for small N. In general more than p weighings would be required because σ
is not usually known. Thus several additional weighings may be made in
order to obtain several degrees of freedom for estimating σ.

When $H_N$ does not exist we have already defined the best design as one which
minimizes the confidence region for estimating the weights; that is equivalent
to maximizing $|a_{ij}|$ or minimizing $|a^{ij}|$. There may be several designs with
the same minimum, but we shall not give all of them. Thus when p = 3 the
best designs are

$$X = \begin{pmatrix} + & + & 0 \\ + & - & + \\ - & + & + \end{pmatrix}, \quad \begin{pmatrix} + & + & + \\ + & - & + \\ - & + & + \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} + & + & - \\ + & - & + \\ - & + & + \end{pmatrix}$$

all of which have Δ = 16 (which is considerably smaller than the value 27
that Δ would have if an optimum design existed). Using the notation

$(a^{ii}) = (a^{11}, a^{22}, \dots, a^{pp})$,

the first of the above designs for p = 3 gives

$(a^{ii}) = (3/8, 3/8, 1/2)$

while the second and third give

$(a^{ii}) = (1/2, 1/2, 1/2)$.

For N = p = 5, two best designs are

[two 5 × 5 arrays of + and − signs; not legible in this copy]

both of which have

$\Delta = 3^2\, 2^8$ and $(a^{ii}) = (2/9, 2/9, 2/9, 2/9, 2/9)$.

For N = p = 6, a best design is

[6 × 6 array of + and − signs; not legible in this copy]

which has

$\Delta = 5^2\, 2^{10}$ and all $a^{ii} = 1/5$.

For N = p = 7, a best design is

[7 × 7 array of + and − signs; not legible in this copy]

which has

$\Delta = 2^{12}\, 3^4$ and all $a^{ii} = 1/6$.

These designs were constructed by a method due to Williamson [8] which
will be described in sections 5 and 7. It is interesting to note that no minor
of an $H_8$ is a best design for N = p = 7, for any minor of an $H_8$ gives
$\Delta = 2^{18} < 2^{12}\, 3^4$ and all $a^{ii} = 1/4$.

5. Spring Balance Problem. N = p = 4k + 3. When N = p and $N \equiv 3 \pmod 4$
the best possible design for the spring balance case is determined by
$H_{N+1}$ if it exists. Let $K_{N+1}$ denote a matrix formed from $H_{N+1}$ by adding or
subtracting the elements of the first row of $H_{N+1}$ from the corresponding elements
of the other rows in such a way as to make the first element of each of
the remaining rows zero. Obviously

$|K_{N+1}| = \pm |H_{N+1}|$

and excepting the first row, the elements of $K_{N+1}$ are 0 and ±2 with the signs
of the non-zero elements the same for elements in the same row. Let $L_N$ be
the matrix obtained by omitting the first row and column of $K_{N+1}$, by changing
all non-zero elements to +1, and by permuting two rows if necessary to make
the determinant of $L_N$ positive. Then

$|H_{N+1}| = 2^N |L_N|$

and it is clear that, given $L_N$, one could reverse the procedure and determine
an $H_{N+1}$. In the same manner, there is a correspondence in general between
square matrices with elements ±1 and square matrices of one less order with
elements 0 and 1. The ratio of the values of corresponding determinants is
always $2^N$ if their determinants do not vanish, hence the 0, 1 determinant will
have its maximum value when its corresponding ±1 determinant has a maximum
value. Thus $|L_N|$ is the maximum value possible for a determinant of
0’s and 1’s of order N, and the value is

(7) $|L_N| = (N+1)^{(N+1)/2} / 2^N$.

The variance factors are

$a^{ii} = 4N/(N+1)^2$.

We knew in advance, of course, that the $a^{ii}$ would be greater than 1/N since
an optimum design cannot exist unless the design matrix has its elements equal
to ±1, and we must here restrict the design to have only 0 and 1 as its elements.
Since $L_N$ is a best possible design for the spring balance case, it follows that
designs for the spring balance problem can be no more than about 1/4 as efficient
as designs for the chemical balance problem.
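
The passage from $H_{N+1}$ to $L_N$ can be followed step by step in code. The sketch below is an illustration under our own assumptions: it starts from a Sylvester-type $H_8$ (any Hadamard matrix would serve), clears the first column by adding or subtracting the first row, drops the first row and column, and replaces the non-zero entries by 1. Function names are ours.

```python
from fractions import Fraction

def sylvester(order):
    """Hadamard matrix of a power-of-two order, built by Sylvester doubling."""
    H = [[1]]
    while len(H) < order:
        H = [r + r for r in H] + [r + [-v for v in r] for r in H]
    return H

def spring_design_from_hadamard(H):
    """Form K by zeroing the first column against row 0, then map to a 0/1 matrix."""
    first = H[0]
    L = []
    for row in H[1:]:
        sign = -1 if row[0] == first[0] else 1      # subtract or add the first row
        k_row = [r + sign * f for r, f in zip(row, first)]
        L.append([1 if v != 0 else 0 for v in k_row[1:]])
    return L

def gram(X):
    """Return X'X as a matrix of Fractions."""
    p = len(X[0])
    return [[Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
            for i in range(p)]

def det(A):
    """Exact determinant by Gaussian elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    d = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            d = -d
        d *= M[col][col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return d

def inverse(A):
    """Invert a non-singular square matrix by exact Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

H8 = sylvester(8)
L7 = spring_design_from_hadamard(H8)
L7_inverse_gram = inverse(gram(L7))
L7_variance_factors = [L7_inverse_gram[i][i] for i in range(7)]
```

For N = 7 this gives a 0, 1 matrix with four ones in every row, a determinant of absolute value $8^4/2^7 = 32$ in agreement with (7), and variance factors all equal to $4 \cdot 7/8^2 = 7/16$.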

6. Spring Balance N > p. When N > p the device used in the chemical
balance case to get optimum designs cannot be used. For if we select p columns
from an $L_N$ we may get rows of zeros which would waste weighing operations.
A different approach is necessary and a clue is given by the designs $L_N$. In
these designs p is odd and the objects are weighed ½(p + 1) at a time in each
weighing operation. We shall show in general that objects should be weighed
½(p + 1) at a time when p is odd, and we shall obtain a corresponding result
for p even.

Let $P_r$ be a matrix whose rows are all the arrangements of r ones and p − r
zeros ($0 < r \le p$). (The symbol should also have a subscript p but that is
omitted because any specific value for p will always be clear from the context.)
The matrix will have p columns and $\binom{p}{r}$ rows. Let Q be a matrix made up of
matrices $P_r$ arranged in vertical order. Let $n_r$ be the number of times $P_r$ is
used in constructing Q. Q is a weighing design for p objects and

(8) $N = \sum_r n_r \binom{p}{r}$

weighing operations. The matrix Q'Q will have diagonal elements

(9) $a = \sum_r n_r \binom{p-1}{r-1}$

and non-diagonal elements

(10) $b = \sum_r n_r \binom{p-2}{r-2}$.

The determinant of Q'Q is

$\Delta = (a - b)^{p-1}[a + (p-1)b]$

and we may write Δ in the form

$\Delta = c^{p-1} d$

where

(11) $c = a - b$, and $d = a + (p-1)b$.

We shall prove the following theorem:


If p = 2k − 1 where k is a positive integer, and if N contains the factor
$\binom{p}{k}$, then Δ will be maximized when

$n_k = N \Big/ \binom{p}{k}$

and all other $n_r = 0$.
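
The construction of Q from the blocks $P_r$, and the determinant formula built from (9) and (10), lend themselves to a direct check. The sketch below is a small brute-force illustration, not the paper's argument: it enumerates every allocation $(n_1, \dots, n_p)$ with the required N for p = 3 and p = 5, confirms that the determinant of Q'Q agrees with $(a-b)^{p-1}[a+(p-1)b]$, and looks for the allocation with the largest Δ. All function names are our own.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def P(p, r):
    """All arrangements of r ones and p - r zeros, one arrangement per row."""
    return [[1 if i in chosen else 0 for i in range(p)]
            for chosen in combinations(range(p), r)]

def Q(p, counts):
    """Stack n_r copies of P_r; `counts` maps r to n_r."""
    rows = []
    for r, n_r in counts.items():
        rows += P(p, r) * n_r
    return rows

def gram(X):
    """Return X'X as a matrix of Fractions."""
    p = len(X[0])
    return [[Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
            for i in range(p)]

def det(A):
    """Exact determinant by Gaussian elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    d = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            d = -d
        d *= M[col][col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return d

def binom(n, k):
    """Binomial coefficient, taken as zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def delta_closed_form(p, counts):
    """Delta = (a - b)^(p-1) [a + (p-1) b] with a, b from equations (9), (10)."""
    a = sum(n_r * binom(p - 1, r - 1) for r, n_r in counts.items())
    b = sum(n_r * binom(p - 2, r - 2) for r, n_r in counts.items())
    return (a - b) ** (p - 1) * (a + (p - 1) * b)

def allocations(p, N):
    """Every (n_1, ..., n_p) with sum of n_r * C(p, r) equal to N."""
    rs = range(1, p + 1)
    ranges = [range(N // comb(p, r) + 1) for r in rs]
    return [dict(zip(rs, ns)) for ns in product(*ranges)
            if sum(n * comb(p, r) for r, n in zip(rs, ns)) == N]

best_p3 = max(allocations(3, 6), key=lambda c: delta_closed_form(3, c))
best_p5 = max(allocations(5, 10), key=lambda c: delta_closed_form(5, c))
```

In both cases the search lands on the allocation named in the theorem: for p = 3, N = 6 it is two copies of $P_2$, and for p = 5, N = 10 it is a single copy of $P_3$.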


We shall demonstrate this statement by showing that if any $n_s$ ($s \ne k$) is
decreased and $n_k$ is increased in such a way that N remains unchanged, then Δ
will be increased. Let $n_s$ be reduced by an amount m so chosen that

$m' = m \binom{p}{s} \Big/ \binom{p}{k}$

is an integer; we may then increase $n_k$ by m' leaving N unchanged. It is readily
found that these changes in $n_s$ and $n_k$ produce the following changes in c and d:

$\Delta c = m \binom{p}{s} \frac{(k-s)(k-s-1)}{p(p-1)}$,

$\Delta d = m \binom{p}{s} \frac{(k-s)(k+s)}{p}$,

both of which are positive or zero when s < k and Δ is necessarily increased.

When s > k, Δc is positive but Δd is negative and it must be shown that the
net effect of these changes is to increase Δ; we shall assume now that $n_r = 0$
when r < k.

$\Delta\Delta = (c + \Delta c)^{p-1}(d + \Delta d) - c^{p-1}d \ge [c^{p-1} + (p-1)c^{p-2}\Delta c](d + \Delta d) - c^{p-1}d$

$\qquad = c^{p-2}[c\,\Delta d + (p-1)d\,\Delta c + (p-1)\Delta c\,\Delta d]$

where in the bracketed factor we have omitted terms in Δc of higher order than the
first. These terms are all positive since all their factors are positive. The
bracket in the last expression on substituting from (9), (10), and (11), may
be reduced to


m 






+ p Q(* “ s)\k + s)(k -*-l)], 


and then to 


- 0[S"- (r : > - •> — 

+ P (§(k-s)\k + S )(k-8-l)]. 

Each term of the sum in the bracket is greater than or equal to zero when k > 1,
$r \ge k$, s > k since the fraction is readily seen to be negative or zero under these
circumstances. The fraction vanishes only when k = 2, r = k, s = k + 1.
The other term in the bracket is negative but it is dominated by the term in
the sum for which r = s, as may be shown as follows: The two terms in question
may be written


n„ 


C:!>- 


s) 


s(k — s) 4- k — s 4- 1 


V - 1 


+ 


(:=D 


— 1 \(k — s)*(k + s)(k — s — l) 


and since n, > m, this expression is less than or equal to 


m 


_ ;)» - ») [ 


s(k — s) + fe — s + 1 , (fc 2 — s 2 )(* — s — 1)' 
— + 


p - 1 


ps 


which is positive for s > k since the bracket is negative as may be seen by 
factoring out 


p(p - l)s 
(fc - s + l)(s 2 + (P 


and putting the result in the form 

l)fc 2 ) — pfc(p — s) + (2s + l)(fc — s). 


Thus ΔΔ has been shown to be positive and the theorem is proved.

The above argument has shown that $P_k$ or repetitions of $P_k$ give more efficient
designs than any other combination of the designs $P_1, P_2, \dots, P_p$. The question
now arises as to whether these are the best possible designs. We shall
show that they are by considering the matrices $L_N$ of section 5 which are known
to give the greatest efficiency in the spring balance case. Let p = 4t + 3
and let $N = \binom{4t+3}{2t+2}$, and suppose $L_p$ exists (i.e. $H_{p+1}$ exists). Using $P_{2t+2}$
as the weighing design we find the $a_{ij}$ are

$a_{ii} = 2N(t+1)/p$,
$a_{ij} = N(t+1)/p$, $i \ne j$.

A single application of the design $L_p$ gives

$a_{ii} = 2(t+1)$,
$a_{ij} = t+1$, $i \ne j$,

and N/p repetitions of $L_p$ give an $a_{ij}$ matrix with elements equal to N/p
times the given elements for one application of the design. The two designs
are therefore equivalent and $P_{2t+2}$ is a best design.

The variance factors for repetitions of the design $P_k$ are

(12) $a^{ii} = \frac{4p^2}{N(p+1)^2}$, $N \equiv 0 \bmod \binom{p}{k}$,

and these are minimum variance factors$^2$ as may be shown by an argument
entirely analogous to that used in proving the theorem. Thus $P_k$ is a design
which not only minimizes the confidence region for estimating the weights,
but also minimizes the individual variance factors.

Efficient sub-matrices of the $P_k$ have not been studied except for small p,
but we may point out that square sub-matrices of order p which are as efficient
as $P_k$ do not exist unless $H_{p+1}$ exists, for by the argument of section 5, it is
possible to construct $H_{p+1}$ from such sub-matrices. Hence we cannot obtain
variance factors as small as those given by equation (12) when N = p unless $H_{p+1}$
exists.

The situation here is analogous to that in the chemical balance case. By
a proper selection of N we can obtain a design with the maximum possible
efficiency for any odd value of p. But here we are much more restricted in our
choice of N. In the chemical balance case N could be any multiple of 4 for
which an $H_N$ existed; in the present case N must be a multiple of p even in the
most favorable instance (p = 4t + 3), and for some values of p it may be
necessary that N be a multiple of $\binom{p}{k}$.

2 The author is indebted to a referee for suggesting this property of the design, and
for several other valuable suggestions and corrections to the paper.

We now turn to the case in which p is even. The theorem corresponding
to the one given at the beginning of this section is:

If p = 2k where k is a positive integer, and if N contains the factor
$\binom{p+1}{k+1}$, then Δ will be maximized when

$n_k = n_{k+1} = N \Big/ \binom{p+1}{k+1}$

and all other $n_r = 0$.

We shall not prove this theorem in detail. By arguments analogous to those
used earlier, it may be shown that Δ is increased when either $n_s$ (s < k) is
decreased and $n_k$ is increased, or $n_s$ (s > k + 1) is decreased and $n_{k+1}$ is increased
with N fixed. This done, we may put all $n_r = 0$ except $n_k$ and $n_{k+1}$ and then
maximize Δ with respect to these two variables subject to the condition that

$n_k \binom{p}{k} + n_{k+1} \binom{p}{k+1} = N$.

The values of $n_k$ and $n_{k+1}$ which maximize Δ may be found by treating them as
continuous variables and using the calculus.

The variance factors for these designs are

(13) $a^{ii} = \frac{4p}{N(p+2)}$, $N \equiv 0 \bmod \binom{p+1}{k+1}$,

but these are not minimum variance factors. In fact one can obtain smaller
variance factors than these by using only $P_k$ in the design (omitting $P_{k+1}$
entirely). In this case

(14) $a^{ii} = \frac{4}{N}\cdot\frac{(p-1)^2+1}{p^2}$, $N \equiv 0 \bmod \binom{p}{k}$,

and

$\frac{(p-1)^2+1}{p^2} < \frac{p}{p+2}$ when p > 2.

We have not found explicitly the design which minimizes the variance factors
for p even, but it appears that the design would be made up largely from $P_k$
with a small proportion of the design devoted to $P_{k+1}$. Thus (14) is very
nearly the minimum possible variance factor.
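
The three variance-factor formulas of this section can be compared with a direct inversion of Q'Q. The sketch below is only an illustration with our own function names: it builds $P_k$ for odd p, $P_k$ together with $P_{k+1}$ for even p, and $P_k$ alone for even p, and checks the diagonal of the inverse against (12), (13) and (14) as written above.

```python
from fractions import Fraction
from itertools import combinations

def P(p, r):
    """All arrangements of r ones and p - r zeros, one arrangement per row."""
    return [[1 if i in chosen else 0 for i in range(p)]
            for chosen in combinations(range(p), r)]

def gram(X):
    """Return X'X as a matrix of Fractions."""
    p = len(X[0])
    return [[Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
            for i in range(p)]

def inverse(A):
    """Invert a non-singular square matrix by exact Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def variance_factors(X):
    """Diagonal elements of (X'X)^{-1}."""
    G_inv = inverse(gram(X))
    return [G_inv[i][i] for i in range(len(G_inv))]

def eq12(p, N):
    """p odd, design P_k: 4 p^2 / (N (p + 1)^2)."""
    return Fraction(4 * p * p, N * (p + 1) ** 2)

def eq13(p, N):
    """p even, P_k and P_{k+1} used equally often: 4 p / (N (p + 2))."""
    return Fraction(4 * p, N * (p + 2))

def eq14(p, N):
    """p even, P_k alone: (4 / N) ((p - 1)^2 + 1) / p^2."""
    return Fraction(4 * ((p - 1) ** 2 + 1), N * p * p)

odd_checks = {p: variance_factors(P(p, (p + 1) // 2)) for p in (3, 5)}
even_both = {p: variance_factors(P(p, p // 2) + P(p, p // 2 + 1)) for p in (4, 6)}
even_single = {p: variance_factors(P(p, p // 2)) for p in (4, 6)}
```

For p = 4 the single block $P_2$ (N = 6) gives 5/12, while $P_2$ with $P_3$ (N = 10) gives 4/15; per weighing, 6 · 5/12 = 2.5 is smaller than 10 · 4/15 ≈ 2.67, which is the comparison expressed by the inequality following (14).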


7. Spring Balance Designs for Small p. When p = 2, each object may be
weighed r times by itself, and the two objects may be weighed together s times
to give

$$\|a_{ij}\| = \begin{pmatrix} r+s & s \\ s & r+s \end{pmatrix}$$

and if Δ is maximized subject to 2r + s = N we find

r = s = N/3,

$a^{ii} = 2/N$,

provided N is a multiple of 3. The most efficient basic design is therefore

$$X = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$$

in accordance with the previous section. When N is not a multiple of 3 the
best design is obtained by using the first row of X for the odd weighing when
N = 3t + 1, and the last two rows when N = 3t + 2.

The case p = 2 is notable in that there is almost nothing to be gained by
weighing the objects in combination. For the variance factors 2/N would
be obtained by simply weighing each object separately N/2 times. The
advantage of weighing in combination is only that square confidence regions in
the $b_1$, $b_2$ space are replaced by ellipses with somewhat smaller area. If
$a^{ii} = (r+s)/(r^2+2rs)$ is minimized subject to 2r + s = N, we find

$r = N(3 - \sqrt{3})/3$, $a^{ii} = 1.866/N$

so that the $a^{ii}$ are reduced slightly from 2/N but at the expense of increasing
the area of the elliptical confidence regions.

For p = 3 the most efficient design when N = 3 is

$$X = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$

as given by $L_3$ or $P_2$. It is easily shown that for N > 3, the most efficient design
is given by repeating X even when $N \not\equiv 0 \pmod 3$. Thus for N = 4 we would
repeat one row of X, for N = 5 we would repeat two rows of X, and so forth.
The variance factors are

$a^{ii} = \frac{9}{4N}$, N = 3t,

$a^{ii} = \frac{9(N+1)}{4(N-1)(N+2)}$, N = 3t + 1,

$a^{ii} = \frac{9(N-1)}{4(N-2)(N+1)}$, N = 3t + 2.

For p = 4 we may attempt to find by trial and error a sub-matrix of the
design given by using $P_2$ once and $P_3$ once, but this would be a tedious process
and the labor would soon become prohibitive for larger values of p. Hence
another method must be found for obtaining the best designs when N = p
except when $L_p$ exists. A method is provided by Williamson [8]. Let $D_p$
be the best design for N = p. Williamson shows that when p < 7, $D_{p-1}$ is
a minor of $D_p$, hence $D_p$ may be found by adding a row and column of variables
to $D_{p-1}$ and expanding the determinant of the result by the bordered expansion.
For small values of p it is easy to determine by inspection what values the
variables should have in order to maximize the resulting expansion William- 
son determined D\ and D 0 by this method 
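There are two types of $D_4$ which give a maximum value of Δ = 9:

$$\begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}.$$

The variance factors are all 7/9 for the first of these, and for the second

$(a^{ii}) = (7/9, 7/9, 7/9, 4/9)$.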

When N = 5, p = 4, there are a number of designs which give a maximum
Δ of 19. None of these however has all $a^{ii}$ equal, and we shall give only one
example:

$$X = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \end{pmatrix}$$

which has

$(a^{ii}) = (12/19, 12/19, 13/19, 8/19)$.

When N = 6, there appears to be no design superior to $P_2$. It has variance
factors all equal to 5/12 and Δ = 48, a very large gain in efficiency over
N = 5 at the expense of one additional observation.

When p = 5 there are three types of $D_5$ which give Δ a maximum value of 25,
none of which has all variance factors equal. An example is

$$\begin{pmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \end{pmatrix}$$

with

$(a^{ii}) = (19/25, 19/25, 16/25, 11/25, 16/25)$.


For p = 6, an example of a $D_6$ with all $a^{ii}$ equal which maximizes Δ is

$$\begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}$$

with Δ = 81 and $a^{ii} = 17/27$. This example was constructed by the bordered
expansion method from $D_5$ and it turns out to be a sub-matrix of $P_3$. It is not
as efficient as $P_3$, however, since substitution of N = p = 6 in equation (14)
gives $a^{ii} = 13/27$. Hence we have shown that there does not exist a minor of
$P_3$ (for p = 6) of order 6 which is as efficient as $P_3$ itself.

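For p = 7, there is a most efficient design given by $L_7$,

$$\begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{pmatrix}$$

with $\Delta = 2^{10}$ and all $a^{ii} = 7/16$.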
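
The determinants and variance factors quoted for the small designs of this section can be recomputed exactly. In the sketch below the 0, 1 rows are typed in as we read them from the text (the row order within each design is our assumption and does not affect X'X), and the helper names are ours.

```python
from fractions import Fraction

DESIGNS = {
    "D4 first type":  ["1110", "1101", "1011", "0111"],
    "D4 second type": ["1110", "1001", "0011", "0101"],
    "N=5, p=4":       ["1001", "1110", "0011", "0101", "1100"],
    "D5":             ["00011", "00110", "01101", "11010", "10101"],
    "D6":             ["111000", "100011", "100110", "001101", "011010", "010101"],
    "L7":             ["1010101", "0110011", "0001111", "1100110",
                       "0111100", "1011010", "1101001"],
}

def gram(X):
    """Return X'X as a matrix of Fractions."""
    p = len(X[0])
    return [[Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
            for i in range(p)]

def det(A):
    """Exact determinant by Gaussian elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    d = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            d = -d
        d *= M[col][col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return d

def inverse(A):
    """Invert a non-singular square matrix by exact Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def summary(rows):
    """Return (determinant of X'X, list of variance factors) for a 0/1 design."""
    X = [[int(ch) for ch in row] for row in rows]
    G = gram(X)
    G_inv = inverse(G)
    return det(G), [G_inv[i][i] for i in range(len(G))]

results = {name: summary(rows) for name, rows in DESIGNS.items()}
```

Each entry of `results` pairs the determinant of X'X with the list of variance factors, so the values stated in the text (Δ = 9, 19, 25, 81 and $2^{10}$) can be read off directly.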

$D_p$ for p = 8, 9, and 10 could presumably be constructed from $L_7$ in the same
way as the designs for p = 4, 5, and 6 were constructed from $L_3$, but the
computations become very tedious for these larger values of p.

The designs given in section 4 were constructed from the above designs by
the method described in section 5.

8. Bias in Measuring Devices. In some kinds of experiments it may be
necessary to estimate a bias in the measuring scale in order to estimate the
measures of the objects. Such a bias may simply be regarded as an additional
object to be measured except that it is an object which must be included in all
the measuring operations. In the chemical balance case the bias presents no
difficulty, for if an $H_N$ exists, then there exists an $H_N$ with a column whose
elements are all +1. Such an $H_N$ may be constructed from any given $H_N$ by merely
changing the signs of all elements in rows which begin with a minus sign. The
result will be an $H_N$ with +1’s in the first column and that column may be
assigned to the bias. We note that the gain in efficiency by measuring objects
in combinations is even greater in the case of a biased measuring scale than when
there is no bias. For if the objects were measured separately, their measures
would be estimated by the difference of two scale readings and would have
variance $2\sigma^2$; hence the variance factors $a^{ii}$ are to be compared with 2 (rather than
1) in the case of bias.

In the spring balance case, the additional restriction that all the elements of
one column be one necessarily reduces the efficiency of the designs in the sense
that the variance factors for p objects and a bias will be larger than the variance
factors for p + 1 objects without bias. When the measures of p objects and
a bias are to be estimated from N = p + 1 measuring operations, a best design
may be obtained by adding a row of zeros and a column of ones (in that order)
to the best design for N = p without bias. This can be seen by recalling that
there are two determinantal expressions for the volume of a simplex with one vertex
at the origin in a Euclidean p space. (A simplex (Sommerville, [9]) is a polytope
with p + 1 vertices bounded by p + 1 (p − 1)-dimensional hyperplanes.) The
determinant of the best design for N = p (without bias) is proportional to the
volume of the largest simplex with one vertex at the origin and the other vertices
restricted to be selected from the vertices of the unit cube. A determinant of
order p + 1 with a column of ones and the other elements zero or one also gives
the volume of a simplex with vertices selected from the vertices of the unit cube.
Hence the two determinants (one of order p and one of order p + 1) must
have the same maximum value, and as one of the vertices may be selected
arbitrarily in the case of bias, we may select the origin.

In general, for N > p, similar geometrical reasoning will show that the best
designs for the spring balance problem in the case of bias are easily constructed
from Hadamard matrices as described in the following theorem:

If X is a best design for the chemical balance problem in the case of bias and if X
contains a row of +1’s, then a best design for the spring balance problem in the
case of bias is given by replacing the −1’s in X by zeros.

We have seen that the best design in the chemical balance case is obtained
from a Hadamard matrix with a column of +1’s. Obviously the matrix may
be also made to contain a row of +1’s by changing the signs of certain columns.
The design X consists of the column of ones together with any other p columns.
The determinant of X'X is $(p!)^2$ times the sum of squares of the volumes of
a set of simplexes in a p space. There are $\binom{N}{p+1}$ of these simplexes
determined by the different combinations of the rows of X taken p + 1 at a time, and
the coordinates of their vertices are the last p elements of the rows of X. The
vertices are therefore selected from the vertices of a cube in the p space which
has its edges parallel to the coordinate axes, the origin at its center, and the
lengths of its edges equal to two. Since X is a best design, the vertices are
selected so as to maximize the sum of squares of the volumes of the simplexes.
Now in the spring balance case we must maximize the sum of squares of the
volumes of a set of simplexes which have their vertices selected from the vertices
of the unit cube. Obviously this may be done by selecting vertices corresponding
to the selection given by X. Thus it is necessary only to set up a
correspondence between the vertices of the two cubes. Since X contains the vertex
(1, 1, 1, 1, ..., 1) which is common to both cubes, the natural correspondence
which identifies a vertex such as (1, −1, −1, 1, −1, 1, ...) with (1, 0, 0, 1,
0, 1, ...) may be used.

The variance factors for these spring balance designs are 4/N (for any p < N)
when N is a multiple of four and $H_N$ exists; when N is not a multiple of four
and modifications of $H_N$ as described in section 4 are used, the variance factors
will differ from 4/N by terms of order $1/N^2$.
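
The effect of replacing −1 by 0 in the case of bias can be seen in a small computation. The sketch below is an illustration under our own assumptions: it takes a Sylvester-type $H_N$ (whose first row and first column are already all +1), assigns the first column to the bias, uses the next p columns for the objects, and compares the object variance factors for the chemical and spring balance versions. Function names are ours.

```python
from fractions import Fraction

def sylvester(order):
    """Hadamard matrix of a power-of-two order; first row and column are +1."""
    H = [[1]]
    while len(H) < order:
        H = [r + r for r in H] + [r + [-v for v in r] for r in H]
    return H

def chemical_bias_design(N, p):
    """Bias column of ones followed by p further columns of H_N."""
    return [[1] + row[1:p + 1] for row in sylvester(N)]

def spring_bias_design(N, p):
    """Same design with every -1 in the object columns replaced by 0."""
    return [[1] + [1 if v == 1 else 0 for v in row[1:p + 1]]
            for row in sylvester(N)]

def gram(X):
    """Return X'X as a matrix of Fractions."""
    p = len(X[0])
    return [[Fraction(sum(row[i] * row[j] for row in X)) for j in range(p)]
            for i in range(p)]

def inverse(A):
    """Invert a non-singular square matrix by exact Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def object_variance_factors(X):
    """Diagonal of (X'X)^{-1}, skipping the leading entry that belongs to the bias."""
    G_inv = inverse(gram(X))
    return [G_inv[i][i] for i in range(1, len(G_inv))]

chem = object_variance_factors(chemical_bias_design(8, 5))
spring = object_variance_factors(spring_bias_design(8, 5))
```

For N = 8 and p = 5 the chemical balance factors are all 1/8 and the spring balance factors are all 4/8 = 1/2, in line with the value 4/N stated above.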

9. Addendum. After this paper was written, the paper of Plackett and
Burman on “The Design of Multifactorial Experiments” appeared in Biometrika,
Volume 33 (1946), pages 305-325. A part of this paper discusses Hadamard
matrices much more completely than we have done in section 3. In particular
Plackett and Burman have constructed all Hadamard matrices of order less
than or equal to 100 (excepting 92).

THE APPROXIMATE DISTRIBUTION OF STUDENT’S STATISTIC

By Kai-Lai Chung

University of Peking, Kunming, China

Summary. It is well known that various statistics of a large sample (of
size n) are approximately distributed according to the normal law. The
asymptotic expansion of the distribution of the statistic in a series of powers of $n^{-1/2}$
with a remainder term gives the accuracy of the approximation. H. Cramér
[1] first obtained the asymptotic expansion of the mean, and recently P. L. Hsu
[2] has obtained that of the variance of a sample. In the present paper we
extend the Cramér-Hsu method to Student’s statistic. The theorem proved
states essentially that if the population distribution is non-singular and if the
existence of a sufficient number of moments is assumed, then an asymptotic
expansion can be obtained with the appropriate remainder. The first four
terms of the expansion are exhibited in formula (35).

1. In a fundamental paper$^1$ P. L. Hsu [2] has devised a method for obtaining
the asymptotic expansion of the distribution of various statistics. The present
paper deals with the so-called Student statistic.

Let

$\xi_1, \xi_2, \dots, \xi_n$

be n independent random variables having the same probability distribution
represented by a distribution function P(x). The rth moment and rth absolute
moment are denoted by $\alpha_r$ and $\beta_r$ respectively. It is assumed that $\alpha_1 = 0$
and that for a certain $k \ge 3$, $\beta_k < \infty$ and that $\alpha_2 > 0$. Hence there is no loss
of generality in assuming that $\alpha_2 = 1$.

Student’s statistic is defined as

$$\bar{\xi} \left\{ \frac{\sum_{r=1}^{n} (\xi_r - \bar{\xi})^2}{n(n-1)} \right\}^{-1/2}, \quad \text{where } \bar{\xi} = \frac{1}{n} \sum_{r=1}^{n} \xi_r .$$

For brevity, we consider

$$n \bar{\xi} \left( \sum_{r=1}^{n} (\xi_r - \bar{\xi})^2 \right)^{-1/2} .$$

Let its distribution function be denoted by F(z), i.e.,

$$\Pr\left\{ n \bar{\xi} \left( \sum_{r=1}^{n} (\xi_r - \bar{\xi})^2 \right)^{-1/2} \le z \right\} = F(z).$$


1 The definitions of the various constants A, Ajt , Qk , Aj , ft 0, , are the same as 

in Hsu’s paper. 
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Discarding the case k = 3 where we can prove a more precise result and the 
singular case which can be shown to admit no asymptotic expansion in the sense 
of Cramir [1], we shall prove in this paper the following theorem : 

Theorem. Let P{x) be non-singular and a 2 k < for some integer fc k 4. 
Then 

(1) F(z] = *(*) + xOO + B(*), m =^jT dy, 

where x( 3 ) « a linear combination of the derivatives 4>'(s), ■ • • , $ (3t ~ 10) ( s ) eac j L 
coefficient of the form n -l '(l £ v £ k — 3) fimes a quantity depending only on a s , 
• • • , ak-i whose beginning terms are given in (35) and where 

(2) | R(z) | g (2*(1 + 1 3 r~V“°, od = 

where Q k is a constant depending on k and P(x)f 
We shall need some of Hsu’s lemmas, i.e., his lemma 3, lemma 7 (both for 
the particular case m ~ 2) and lemma 8, These we shall quote with this num- 
bering. The application of Hsu's method to Student’s statistic depends on the 
following lemma. 



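2. Lemma A. For $u \ge -1$, $l \ge 1$, we have

$$1 + \sum_{j=1}^{2l-1} \frac{\Gamma(\tfrac{3}{2})}{\Gamma(\tfrac{3}{2} - j)\,\Gamma(j+1)}\, u^j - \left\{ 1 + \sum_{j=1}^{2l-1} \frac{(-1)^j\, \Gamma(\tfrac{3}{2})}{\Gamma(\tfrac{3}{2} - j)\,\Gamma(j+1)} \right\} u^{2l} \le \sqrt{1+u} \le 1 + \sum_{j=1}^{2l-1} \frac{\Gamma(\tfrac{3}{2})}{\Gamma(\tfrac{3}{2} - j)\,\Gamma(j+1)}\, u^j .$$
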

Proof. By Taylor’s expansion of \/l + u, we have 



whence it follows that (1 + is finite, and positive. The right- 

hand side inequality follows. 
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Similarly, if u ^ 0, 


21— I 

vTT^ & 1 + z 


G) 


2Z-1 

ii+S 


"‘rd-^ro + i) 

r (D 


+ 


ffl 


( 1 -) 


T(2l + 1) 


’"r(|-j)ro + i) 


2i—i (— i) ; r 

|i + E 


” 1 r(|-i)r(i + 1 ) / 

since by a well-known result on the binomial theorem we have 

(-’Mi) 


1+S KH 

For — 1 ^ u < 0, we have 


= Vl - 1 = 0 . 


r(j + 1) 


21-1 

1 + z 


(i) 


I " 1 r(|~i) r( i+ 1) 

For — 1 £ u < 0, we have 

(i) 


3 /r-r— N 

u - VI + u = ^ , say. 


D 


21 — I 

D = 1 + E 


,_1 r(|- j)r(j + D 


w J + V , "l + 


■w 


21 — 1 

^ i+ E 




Next, 


,_1 r (l"j) r 0 ' + 1 ) 


21— 1 

= u+ z 


is a polynomial in u of the form 

u (cio -j- aiw -f- ■ • • -f- fl 2 iw Ji ) 


u’ \ - (1 + u) 


where a<) > 0 and the successive coefficients have alternating signs; hence for 
-1 < « g 0, «o + a lit + ■ • ■ + a 2 iu il assumes its maximum at u = —1. This 
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maximum is obtained by putting u = — 1 in the numerator, hence for — 1 § 

u < 0, 


21—1 ( 1)' 
NZu* 1 + E 7 ~ ; 


g ( - i),r (D 

y_1 r (| - j) r(j + l) 


The left-hand side inequality in the lemma now follows. 

For brevity we write the inequalities as

(3) $1 + P_{2l}(u) = 1 + P_{2l-1}(u) - b_{2l} u^{2l} \le \sqrt{1+u} \le 1 + P_{2l-1}(u)$, $b_{2l} > 0$.
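
A numerical spot check makes the two-sided bound (3) concrete. The sketch below rests on our reading of the lemma: $P_{2l-1}(u)$ is taken as the Taylor polynomial $\sum_{j=1}^{2l-1} \binom{1/2}{j} u^j$ and $b_{2l}$ as $1 + P_{2l-1}(-1)$; the grid of test points and the function names are our own choices.

```python
from fractions import Fraction
from math import sqrt

def binom_half(j):
    """Generalised binomial coefficient C(1/2, j) as an exact Fraction."""
    c = Fraction(1)
    for i in range(j):
        c *= (Fraction(1, 2) - i) / (i + 1)
    return c

def P(m, u):
    """Taylor polynomial without constant term: sum of C(1/2, j) u^j, j = 1..m."""
    return sum(binom_half(j) * u ** j for j in range(1, m + 1))

def b(l):
    """Coefficient b_{2l} = 1 + P_{2l-1}(-1) of u^{2l} in the lower bound."""
    return 1 + P(2 * l - 1, Fraction(-1))

def bounds(l, u):
    """Return (lower, upper) bounds for sqrt(1 + u) as given by (3)."""
    upper = 1 + P(2 * l - 1, u)
    lower = upper - b(l) * u ** (2 * l)
    return lower, upper

# Grid of rational test points from u = -1 to u = 5 in steps of 1/20.
GRID = [Fraction(i, 20) for i in range(-20, 101)]

def holds_on_grid(l, tol=1e-12):
    """True if lower <= sqrt(1 + u) <= upper at every grid point (up to tol)."""
    for u in GRID:
        lower, upper = bounds(l, u)
        root = sqrt(1 + u)
        if float(lower) > root + tol or root > float(upper) + tol:
            return False
    return True

check = {l: holds_on_grid(l) for l in (1, 2, 3)}
```

For l = 1 the bounds read $1 + u/2 - u^2/2 \le \sqrt{1+u} \le 1 + u/2$, with equality on the left at u = −1; the grid check above finds no violation for l = 1, 2, 3.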


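3. We write

$$\sum_{r=1}^{n} (\xi_r - \bar\xi)^2 = \sum_{r=1}^{n} \xi_r^2 - n\bar\xi^2 = n + \sqrt{n(\alpha_4 - 1)}\,X - Y^2,$$

where

$$X = \frac{1}{\sqrt{n(\alpha_4 - 1)}} \sum_{r=1}^{n} (\xi_r^2 - 1), \qquad Y = \frac{1}{\sqrt{n}} \sum_{r=1}^{n} \xi_r .$$

Then Student's statistic may be written as

$$n^{1/2}\,\bar\xi \left( \frac{1}{n} \sum_{r=1}^{n} (\xi_r - \bar\xi)^2 \right)^{-1/2} = Y \left( 1 + \sqrt{\frac{\alpha_4 - 1}{n}}\,X - \frac{Y^2}{n} \right)^{-1/2}.$$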
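
The rewriting of Student's statistic in terms of X and Y is purely algebraic, so it can be confirmed on any sample. The sketch below is illustrative and rests on assumptions: X is taken to be $\sum(\xi_r^2 - 1)/\sqrt{n(\alpha_4-1)}$, Y to be $\sum \xi_r/\sqrt{n}$, and the statistic is normalized with divisor n, as read from the garbled source. The name `student_two_ways` is hypothetical.

```python
import math
import random

def student_two_ways(xi, alpha4):
    """Compute the statistic directly and through X and Y (normalization is an assumption)."""
    n = len(xi)
    mean = sum(xi) / n
    direct = math.sqrt(n) * mean / math.sqrt(sum((x - mean) ** 2 for x in xi) / n)
    X = sum(x * x - 1.0 for x in xi) / math.sqrt(n * (alpha4 - 1.0))
    Y = sum(xi) / math.sqrt(n)
    via_xy = Y / math.sqrt(1.0 + math.sqrt((alpha4 - 1.0) / n) * X - Y * Y / n)
    return direct, via_xy

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(50)]
# alpha4 = 3 is the fourth moment of the standard normal distribution.
direct, via_xy = student_two_ways(sample, alpha4=3.0)
print(direct, via_xy)
```

The two values agree up to rounding for any value of `alpha4` greater than 1, because the factor $\sqrt{\alpha_4 - 1}$ cancels between X and its coefficient.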

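Then, for every z, we have

$$F(z) = \Pr\left\{ Y \left( 1 + \sqrt{\frac{\alpha_4 - 1}{n}}\,X - \frac{Y^2}{n} \right)^{-1/2} \le z \right\} = \Pr\left\{ \sqrt{1 + \frac{z^2}{n}}\;Y \le z \sqrt{1 + \sqrt{\frac{\alpha_4 - 1}{n}}\,X} \right\}.$$

For brevity let

$$\sqrt{1 + \frac{z^2}{n}}\;Y = V, \qquad \sqrt{\frac{\alpha_4 - 1}{n}}\,X = U.$$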

Suppose $z \ge 0$; then we have by (3),

$$z + zP_{2l-1}(U) \ge z\sqrt{1+U} \ge z + zP_{2l}(U),$$

(4) $\Pr\{V \le z + zP_{2l-1}(U)\} \ge F(z) \ge \Pr\{V \le z + zP_{2l}(U)\}.$

Suppose $z < 0$; then we have by Lemma A a similar inequality with the
extreme terms exchanged.

Now we take l = j^j, and fix it henceforth. 
Our next step is to obtain an asymptotic expansion for

$$\Pr\{V \le z + zP_m(U)\} = \Pr\left\{ Y \le z \left( 1 + \frac{z^2}{n} \right)^{-1/2} \bigl( 1 + P_m(U) \bigr) \right\}$$

with m = 2l - 1 or 2l, $l \ge 1$.
Let b be any real number, and

$$L_m(x) = z \left( 1 + \frac{z^2}{n} \right)^{-1/2} P_m\!\left( \sqrt{\frac{\alpha_4 - 1}{n}}\,x \right).$$

Until section 12, we shall write simply L(x) for either of the $L_m(x)$.

4. Let W be the probability function of the distribution of the random point
(X, Y) and let $f(t_1, t_2)$ be the characteristic function:

$$W(S) = \Pr\{(X, Y) \in S\} \quad \text{for every Borel set } S \text{ in } R^2,$$

$$f(t_1, t_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{i(t_1 x + t_2 y)}\,dW.$$


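Then

(5) $\Pr\{Y \le b + L(X)\} = \iint_{y \le b + L(x)} dW = \iint_{y \le b} dW + \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} G(x, y)\,dW$

where

$$G(x, y) = \begin{cases} 1 & \text{if } b < y \le b + L(x), \\ -1 & \text{if } b + L(x) < y \le b, \\ 0 & \text{otherwise.} \end{cases}$$

We approximate G(x, y) by H(x, y), where

$$H(x, y) = \begin{cases} e^{-\epsilon x^{2l}} & \text{if } b < y \le b + L(x), \\ -e^{-\epsilon x^{2l}} & \text{if } b + L(x) < y \le b, \\ 0 & \text{otherwise.} \end{cases}$$

We approximate dW by $(w(x, y) + \gamma(x, y))\,dx\,dy$, where

$$w(x, y) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-i t_1 x - i t_2 y}\,\phi(t_1, t_2)\,dt_1\,dt_2,$$

$$\gamma(x, y) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-i t_1 x - i t_2 y}\,\phi(t_1, t_2)\,\psi(it_1, it_2)\,dt_1\,dt_2,$$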

„ - s(5=D - 7=5 
and f(?<! , ik) is given in Lemma 3 by taking therein ft 

being any of the f.’s. 

We write 


e ~ i 

\/ on — 1 1 f 


( 6 ) 


[ f 0(x,y~u)dW 

J—OO { c 

/ •c 

j_ (Gfalf -U) - H(x,y - «)) dP 7 


L L ^ x,v u) ~ H( - X >y - w)) (w(^,y) + 7 (x,y)) dy dx 

/ X fX 

/ H{x,y ~ u) dW 

00 •»«“ DO 


“ /„ / «)(»(*,») + 7(*,»)) dydx 


/ " r X 

„ ~ u ) ( w ( x ,y) + y{%,v)) dy dx 


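5. We have

$$|G(x, y - u) - H(x, y - u)| \le 1 - e^{-\epsilon x^{2l}} \le \epsilon x^{2l},$$

(8) $\left| \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \bigl( G(x, y - u) - H(x, y - u) \bigr)\,dW \right| \le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \epsilon x^{2l}\,dW = \epsilon E(X^{2l}) \le Q_k \epsilon$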
since 


E(z!,> ^‘ E (vS=r)“se. 

where Q fc depends on a, , a ik ■ 

Similarly, 

(9) I /. L k ~ “) “ #(*>» - u))(w(x,y) + y(x,y)) dydx 

Next, 


^ Q*« 


1. /. ~ »)(«(*^) + y(*, 10) (fydx 

(10) // W X >V)+y(*>V))dy<h- f J (W(x,y)+y( x> y)) d ydx 

«S“+Hi( i) ^ J 


!/£u+i 


where the first term on the right-hand side, regarded as a function of $n^{-1/2}$, has
a Taylor expansion in powers of $n^{-1/2}$, whose first few terms we shall compute in
section 9; for the present let us denote it by $B(u + b) + C(u + b)\,n^{-(k-2)/2}$, where

C = C(u + b) is a constant depending on k, P(x) and z, a more explicit estimate
of which will be given in section 10.

Further, we have 

(H) j[”° [ G(x,y -u)dW = f f dW- f j dW 

l)Su+5+l(t) I/gu+b 

by Cramer’s asymptotic expansion for the mean V nY, and as is also shown 
in Hsu’s paper we have 

(12) / / dw - / / (w(x,y) + y{x,y))dydx=A k n~ i{k ~ 2) 

y^u+b v£u+b 

Collecting all the results from (5)-(12), we get

f J dW - B(u + b) - C(u +b )n“ Ufc - 2> 

1/S u+l+L(*) 

= A*(« + n^~ 2) ) + I" I* H(x,y - u) dW 

•i — oo J— ec 




H(x,y - u)(w(x,y) +y(x,y)) dydx 


Now we use A. C. Berry's weighting factor $(1 - \cos Tu)/u^2$ and obtain


tr 


1 ( j j dW — B(u + b) — C(u + b)j du 


y Ja !*+!>+£(*) 

= AfcT(« + r^ ic ' ! “ 2, ) 


(13) 


+ f 1 - T~ (f f - U) dW 

•Loo vr J-oo 

-a: H(x,y - u)(w(x,y) + y(x, y)) dydx^du 


since

$$\int_{-\infty}^{\infty} \frac{1 - \cos Tu}{u^2}\,du = \pi T.$$

6. To transform the triple integral on the right-hand side of (13) we use the 
Fourier transform as Hsu did. 

Let 



6 -,o* a * v H( x , y) dy dx = h(h ,h ) ; 
r r 

J — gg iL>(Q 


-itiX-UiV 


r e~ >tlX - ihv H(x,y - u) dy dx = e~ itiU h(k , k) 

J- — oC d— CQ 

(jf H(x,y - u) diij dy dx 


f Lt ™ Tu 6 -“>» du = . A 

*1 — co 'li? Q 


_ WiT I k | )h(ti ,k) if | k | '< T 
\o otherwise 

ir(T - \ k |) if \k | < T 


otherwise 


By Fourier inversion we have, almost everywhere, 


— cos Tu 
u 2 


H{x, y - u) du = ~ ^ £ e"‘* + " lV (r - | phfo, fe) dt 2 dh 


Hence 


r i “ c °st« r /■“ x , 

I Tfi / / #(*> V ~ u) dW du 

•*-00 li J- 05 J_ oo 

— Jjj; L K f_ T I k > k)h(k, k) dk dt\ 


Similarly we obtain 


r°° i — . cos / r“ f” \ 

i-oo u 2 y ” u )( w (*, 1/) + 7(*> 1/)) dy dx) du 

(15) J " ' 

~ 4ir L„ L T ^ ~~ ^ ^ > 4 *^) }^(h , fe) dfe d<! 

From (14) and (15) we obtain 

r i - cos Tu ( r <•* 

L — * — [L L H(x> y-^ dW 

~ j I H(x, y — u)(w(x, y ) + y(x, y )) dy dx)du 

(16) 50 ' 

=1£O t - 

— <£(h , i 8 )(l + ^(ih , it 3 ))}h(h , £a) d< 2 d<i . 

7. To estimate the double integral on the right-hand side of (16) we break
it up into parts and use the following estimates of $h(t_1, t_2)$.


Lemma B. We have for l — I — I ^ 1 9 
( 1 ) 

( 2 ) 


I HU , U) | g Ajt | 2 | E fa - l) b n* e _0+1)/ *‘; 




h(k , k) | ^ Qkt~ z N{ ! k |, n‘\ C mi ) 


where N(\k [, n~ il2 , «~ 1/2! ) is a polynomial with constant coefficients in the indicated 
arguments. 

Proof, 

! MM ,k)\ = f [ dydx- f j dy dx 

6 < V £ l + i (*) b + L (*)<»^6 

= ( f f l+LM - [ f ) d y dx 

= (— f°° e ~‘ ilx ~ ,xil ( e ti * LM - 1) dx. 

J-oo 


Hence 


Since 


we obtain 


! h(k, k) | ^ \t2\~ 1 \kL(x)\e~‘ lU dx. 

J— OO 


2 1 


| Ux) I 5 A k 1 2 I £ (“1 - D* » I * I 

J-l 


A(ii, fe)| ^ Z («< - 1) ,J « iJ f M J 

J-l J-00 


e _ “ 11 dx 


Next, we write 


j-i 


h(k, k) = (— ik) £ u"(x)v(x) dx 


with 

u"{x) = e" 1 * 1 -, •(*) = e"“ ,, (e' ,l,lW - 1 ). 


Integrating by parts twice, we get 
whence 

I h(h ,60 | e-' xU { I L"(x) | + | L'(x) | + . | * P' - ’ I L(x) \ 

J— OQ 

+ eV '- 1 1 m | + | k II L'\x) I }dx 5 Q k tj\\ z | + z 5 )7V(| i, I, n- ! , r' 1 ) 

The lemma is proved. 

Now we write 

f C (T- \ U\)[f - ‘Kl++)\hdt i dh 

J— do J-r 


(17) 


= SI + H + II =Ji + 73 + Jj - 

|ill>0*"' l«L|SO*lt» 


On Zi we use Lemma 3 and Lemma B, (1) : 

|7i|£4»M 

» — no 4-59 \ /*■! / 


•te ( i <« r + • • • + k 


2J 


g Q* | z | IV !U-J) 2 e - w+1)/2 ‘, 

J-l 


On Ij we use Lemma 7 and Lemma B (2). Since | h I > QiW \ | $(1 + ^) | ^ 
e~ nQ \ and by Lemma 7, p{iin~\ UrT*) = e~ Qk so that |/(<i , fc) | g e~ nQt , | / — 
$(1 + *) | SJ e'" 0 * 


/a £ 0*z 2 // Ttf e~ nQ > Ndkl.n-*, r 1/5 ‘) dh dh . 

1**1 

Let t = n~\ /9 > 0, then it is evident that 

I I'll £ Q*z*. 

Similarly using Lemma 7 and Lemma B, (1) on Z 5 we see that 

I J. I £ 0* M • 

Therefore 

4r L It ^ ~ I ** l){/& > k) ■" <t>(h , b)(l + , iti))}h{t x , ti) du dil 

g Qt (l Z I + a 2 + 1 2 1 Tn~ i(k ~ v t n~"e ~™ lu ) . 
8. Combining (13), (16), (17) we obtain 


(19) 


L ~ — ( II dw ~ B{u +b) ~ C{u + b ^ (k ~ v ) du | 


V£v+t>hL(x) 


£ Qk (r e + Tn H( *- 2) + 1 2 1 + z 5 + I * | E n u r Qwm> ) . 

Now we shall choose T and $\epsilon$ suitably. Let

$$T = n^{\alpha}, \qquad \epsilon = n^{-\beta}, \qquad \alpha > 0, \qquad \beta > 0.$$

To make the right-hand side of (19) a constant depending on z only, we must
have $\alpha \le \tfrac12 (k - 2)$, $\beta \ge \alpha$. Then

^ n -*i t -u+i>/« _ n (W~»/+»/« 

i-i i-i 

We must choose /9 < A/2, then 

2n“ l 'r tfM)/ “ g Ain™-" 1 * 1 . 

j-i 

To make the exponent as small as possible we choose $\beta = \alpha$, then

| 2 I Tn- i(t - 2) £ n -»>r w+1), « g 4* I s | = A k |a 1 

j-i 


since $\alpha$ is to be as large as possible, we choose

. (k- 2 (fc - l)l\ 
a- ao - g , 2(; + x) 

Then we obtain 

f° 1 ~ ( jj dw — B{u + b)— C(u + 6)n-*«-«^ 


■a- 


du\ 


( 20 ) 


#Su+t+M>) 


^ Qk( 1 + «*)• 


Let $F^*(u)$ be the distribution function of $Y - L(X)$, and let

$$F_1(u) = B(u) + C(u)\,n^{-(k-2)/2}.$$

Then we may write (20) as


( 21 ) 


f 1 {F*(u + b)~ Fi(u + b)} du 


2* Q»(l + A 


By the definition of $F_1(u)$ we see that the conditions in Lemma 8 are all satisfied
with a certain constant D depending on k, P(x), and z for the M therein. Then,
choosing b to be the a in Lemma 8, we obtain from Lemma 8 and (21),


( 22 ) 


Z>rs{ 3 jf - — da: - Tfj g &(1 + 2 s ) 
where 


Now there exists A such that if TS > A, then 

/.rj 


„ f 1 -dOSJ , . , 

3 / = ax — it > 1, 

Jo x 2 


hence it follows from (22) that 

TS S max (A, Z) -1 Qt( 1 + z 2 )). 

Thus for another Q[ exceeding both A and the above <3* , we have 

TS g Qfe(l + z 2 ) 
and so finally, dropping the prime, 

(23) $|F^*(u) - F_1(u)| \le Q_k (1 + z^2)\,D\,T^{-1} = Q_k (1 + z^2)\,D\,n^{-\alpha_0}.$

In particular, taking b to be $z(1 + n^{-1} z^2)^{-1/2} = z'$, say:

(24) $\Pr\{Y - L(X) \le z'\} = B(z') + C(z')\,n^{-(k-2)/2} + A_k (1 + z^2)\,D\,n^{-\alpha_0}$

where $B(z') + C(z')\,n^{-(k-2)/2}$ is the Taylor expansion with a remainder of

$$\iint_{y - L(x) \le z'} \bigl( w(x, y) + \gamma(x, y) \bigr)\,dy\,dx$$

and D is an upper bound for

$$|B'(u) + C'(u)\,n^{-(k-2)/2}|.$$

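9. Let $\lambda = n^{-1/2}$, and rewrite the $z' + L_{2l-1}(x)$, $l \ge 2$, there as $g(\lambda)$:

$$g(\lambda) = z' \left( 1 + \frac{(\alpha_4 - 1)^{1/2}}{2}\,\lambda x - \frac{\alpha_4 - 1}{8}\,\lambda^2 x^2 + \frac{(\alpha_4 - 1)^{3/2}}{16}\,\lambda^3 x^3 + \cdots \right).$$

Then

$$g(0) = z', \qquad g'(0) = \frac{(\alpha_4 - 1)^{1/2}}{2}\,z' x, \qquad g''(0) = -\frac{\alpha_4 - 1}{4}\,z' x^2, \qquad g'''(0) = \frac{3(\alpha_4 - 1)^{3/2}}{8}\,z' x^3.$$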





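Let $p \ge 0$, $q \ge 0$; $w_{pq}(x, y) = \dfrac{\partial^{p+q}}{\partial x^p\,\partial y^q}\,w(x, y)$, where $w(x, y)$ is defined in section
4 and we know that

$$w(x, y) = \frac{1}{2\pi \sqrt{1 - \rho^2}}\,e^{-(x^2 - 2\rho x y + y^2)/2(1 - \rho^2)}.$$

Let

(26) $f_{pq}(\lambda) = \int_{-\infty}^{\infty} \int_{-\infty}^{g(\lambda)} w_{pq}(x, y)\,dy\,dx.$

Then

$$f_{pq}'(\lambda) = \int_{-\infty}^{\infty} g'(\lambda)\,w_{pq}(x, g(\lambda))\,dx,$$

(27) $f_{pq}''(\lambda) = \int_{-\infty}^{\infty} \bigl( g''(\lambda)\,w_{pq}(x, g(\lambda)) + g'(\lambda)^2\,w_{p,q+1}(x, g(\lambda)) \bigr)\,dx,$

$$f_{pq}'''(\lambda) = \int_{-\infty}^{\infty} \bigl( g'''(\lambda)\,w_{pq}(x, g(\lambda)) + 3 g''(\lambda) g'(\lambda)\,w_{p,q+1}(x, g(\lambda)) + g'(\lambda)^3\,w_{p,q+2}(x, g(\lambda)) \bigr)\,dx.$$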
Let 

4 “ 4M “ 75 JL *• 4 “ ■ £ * (z) | _ 

(28) 

f Pg — I x Wpq (x,z ) dx 
J— 00 

We have computed the following table of values of Z Pg : 



3 g 



o 


0 

0 

0 

0 

1 

-P$ (I+1) 


0 

0 

0 

2 

<$ <5) p 2 $ (s+2) 

2 P $ W+1) 

2$ (,) 

0 

0 

3 

— 3p$ (9+1) - P V 9+3) 

— 3$ <9> - 3p V 9 +2> 

— 6p4> <s+1) 


0 


Next, we find, from (25)-(28), 

/oo( 0 ) = f>; 

/«(0) = 4.»-i for 3 = 1; 

/ M ( 0 ) = z >l\ q 

(92) 

&W = - ~“7 ~ s'4, + 


a 4 — 1 _, T a ,04—1 _ /It 2 

^ PQ ~T ^ ^ * P»ff+1 


/"(o) = 3(a o'/i. - 3(ai ; 1),,! o-t„ 
Now we can expand

$$\int_{-\infty}^{\infty} \int_{-\infty}^{g(\lambda)} w(x, y)\,dy\,dx = f_{00}(\lambda).$$

Write the Taylor's series for $f_{00}(\lambda)$:

$$f_{00}(\lambda) = f_{00}(0) + f_{00}'(0)\,\lambda + \frac{f_{00}''(0)}{2!}\,\lambda^2 + \frac{f_{00}'''(0)}{3!}\,\lambda^3 + \cdots$$


Substituting from (29), we get 

r< ( *> (ai _ Di/* 


f oe 

/ w(x,y) dy dx = 4 > — 

OQ •/— 00 


271 1 ' 2 


6 


p*'f> (1) 


(30) 


+ { -2'(4> C0) + P*$ m ) + + P V a> )} 


8 n 


+ i 3z '(~ 3 ^ U) ~ P 4>(>1 > " 3 Z ' 2 (-3p4- m - ,**«>) 

+ s ,J (-'3 P $ t8> ~ P V 6) )1 + ■ • • 

Further, we must obtain the beginning terms of y(x, y) as given in Lemma 3, 
for which purpose we refer to Hsu’s paper. We have, in fact 

m iu) = _ +(% _ ^?V +(2i - ™ + u±\± + . . . 

W 6n 1/2 + V 24 36/n + \120 72 + 2l6/n 3/2 + 

where 


S'. + 


- «« + 3 V'S^T ft + 3 5^2! w! + 

04—1 (<*4 — 1) 8/Z 

. j a*— 4a« , 

= («< — 3)/ 2 + 4 ^ ^ ' * ‘ 

Ui = £ ( {<i ^=1 + **) - 10£ (* + fef ) E ( h fek + w ) 

= (as — 10a»)<S + • • • 


To avoid the exhibition of very long expressions, let us separate the terms
in $\psi(it_1, it_2)$ according to the powers of $n^{-1/2}$, and denote the terms of the powers
$n^{-1/2}$, $n^{-1}$, $n^{-3/2}$ by $\psi_1$, $\psi_2$, $\psi_3$, respectively.

Thus $\psi_1 = -iU_3/6n^{1/2}$, and the corresponding $\gamma_1(x, y)$ is





(31) 


Ti (x,y) ~ ^—p- 2 (a a wo3(x,y) + 3 s/ an — 1 w n (x,y) 


+ 3 w n (x,y) + 

on — l / 

where, as hereafter, the terms omitted will yield nothing in the long run.
Now we have by (31) and (26),

/* 00 i»o(X) 

/ / yi(x, y) dy dx 

*L_oo J— CO 


~ “ 6 n m + 3 Von - T/u(x, y) + 3 ^-_ 2 p / a (g t j/) + . - •) 

= “ 6 ^ (^/«(°) + 3 Vo* - l/«( 0 ) + 3 - ;-_ 2 p / 2 i(Q) + ■■.) 
(«a/oa( 0 ) + 3 V «4 - l/i'i( 0 ) + 3 p-^/^O) + ...) 

“ i2^/ 2 («a/o' 3 (0) + 3V«4 - l7u(0) + 3 ^-~- 2 p/n(0) + ••■) 

(32) — 6^1/2 (<X3-I<>2) — — z '( a t>J°3 + 3Vai - 1 In) 

+ f (* A + 3 vW^T A + 3 2LZ*? it) 

- z' 2 (*» ll + 3 Vo* - l In + 3 ij,)) 4 

= - aSs*" + + 3V^*< 2 >) 

+ 15& {*' [ a3($<3> + pV6,) + 6 V^Tp4- (8> + 6 

- z' 2 [« 3 ($ (4) + pV 6) ) + 6 V^ip* (t) + 6 p-T 2 - p $ (2) j) + . . . . 

Similarly, omitting the intermediate steps to save space, we have 

/ °° i* p (X) 1 

„ jL y ^ X> ^ dydx = 7 ^n “ 3 ^ <8) + 

<*» - nssT { 3 <°» - 3 >^"’ + 12 (75=1 •" 

+ 2a 2 p$ !7> 4- 12 a 3 \/cn — 14> <6) 1 4~ • ■ • . 
*00 *ff(X) 

/ / V)(x, y) dy dx 

a ) — CO V— 00 

_JL / «. - 10«» . 5««^a» , $( 8)\ , 

n»' 2 V 120 ^72 ^ 216 / + 


Combining (30), (32)— (34) and simplifying, we obtain, as the first four terms 
of the asymptotic expansion of F(z): 

J J (w(x, y) + y(x, y )) dy dx = $ - (3z'4> (1) + 4> w ) 


V-LW£l' 


i /2 lz» *«+**» 


4n I 6 


4. s / / £? $(« + 2(a 4 - 1) - a» _ <*4 - 1 $ («»\ 
\ 3 2 2 / 

+ # » + f *“)} 


(35) 1 f at - 10a 3 t «) ■ <** $ «) , «i g(«) 

v T 24n 3/2 \ 5 3 9 

_(_ F 6«5 — 3o 3 — 9a 3 Q4 7« 3 (a4 — 1 — «£) — 2o t ^ (3) 

L 2 2 

I <33(^3 — 5a< + 7) (6) _ 03 .(7) 

2 3 . 

4- g' 2 j~ 9° ; 3 04 + 3«3 — 605 ^C2) 03(803 — 7 O4 + 7) 

+ g' 3 j~ -3g3(«4 - 1) $ C3) _ «S $ (S)J| + . . . _ 

10. In order to estimate the remainder $C(z')\,n^{-(k-2)/2}$ in the Taylor expansion
we write, in accordance with Lemma 3,

£ w *p(X) 

/ (w(x, y ) + y(x, yj) dy dx 

00 V — ao 

Z 00 i»0(M f k — 3 ■) 

m J m jw(*i V) + 2 X , '2(-l)' l+> ' 1 a^w^ix, y)> dy dx 


= /oo(X) + L X'2(-l Y 1+ ” a rin f riy ,(\) = Ejfi>(0) -! 

>“1 j»»0 J ' 


» A~2 A — 3 


+ /o ( r>*) + gX'Z(-l)-‘ + ' s a nri 

■ {If +«■■’»> ( i - iT-,). } 
\ — "2 k— 5 \ li*"J 

= BV) + (pry,+ S«-1 r^tsrm 

- B(z') + (/S ■”(«>.) + £/!£?“"(«)) ■ 

Thus 

C(*0 - a h (f^\e\) + §/E?^(«o) 

Now we may write 

A) = f X(n(g M (6\))‘)w Pi (x, g(e\)) dx 

J—ao 

where, if we attach a weight s to $g^{(s)}(\theta\lambda)$, the polynomial under the integral
sign is isobaric of weight $k - 2 - \nu$ in these $g^{(s)}$'s, and the coefficient of each term
is a constant multiple of a certain $w_{pq}(x, g(\theta\lambda))$. Further, it is easily seen by
induction that we have

g w (6\) = Pi 4 j.(2)(1 + 

where $P_{1+2s}(z)$ is a polynomial of the three variables z, x, $\theta\lambda$ which is of at most
the (1 + 2s)th degree in z and of the (2l - 1)st degree in x, and whose
coefficients are all $A_k$.

Therefore, 

I fy'l-t r \^) I =! J 0*0 X | + X + ’ ’ • + | X f l l ) 

• (1 + | z | <t+2,ut - 2 - >),_ V M (x, g(e\))dx 

= LI Qk{] x \ + x * + •■■ + \ x Dd + I 2 | 1+2f *"' 2> ) e - dx 

S 0,(1 + | z I 2 *" 3 ) 

Thus 

(36) C(z') ^ 0,(1 + | z I 2 * -3 ) 

Lastly, an estimate of D is easy : 


j J A 0° A U+I»(*) 

F[(u) I = L— / / (v)(x, y) + y(x, y )) dy dx 

\ (I'll v— eO J— c* 

^ f (u’( x, u -f L(x)) + ] y(x, u + L(x)) |) dx g Q* . 

J— eo 


Collecting the results of (24), (36), (37) we obtain 

Pr{7 - L(X) ^ z'} = B(z') + A,((l + 1 z | 2 *^)n _1( *~ !) + (1 + z^n - * 0 ), 



(38) 
Or, more simply, 

(39) | Pr{Y ~ L(X) g z') - B{z') j £ Q*(l + | 2 

where the first four terms of B(z') are given by (35). 


12. To return to F(z). We see that B(z') depends on the function L(x).
Recalling section 3 we now write $B_m$ for the B corresponding to $L_m$, with m =
2l - 1 or 2l.

Then by (4) the value of F(z) lies between

$$\Pr\{Y - L_{2l-1}(X) \le z'\} \quad \text{and} \quad \Pr\{Y - L_{2l}(X) \le z'\}.$$

From the asymptotic expansion just obtained for either of them, we see that 
the absolute value of their difference does not exceed 

| B t Uz') - B u {z>) | + Q*(l + | 2 

But 

$$L_{2l}(x) = L_{2l-1}(x) - z'\,b_{2l}\,(\alpha_4 - 1)^l\,n^{-l}\,x^{2l} = L_{2l-1}(x) - b_{2l}'\,x^{2l}, \text{ say,}$$

hence


/ «o — j (r) 

/ , „ K*» y) + y(*, y)\dy dx 

£ Qk b'n g Q* I 2 | »-' < Qk | 2 | n“"». 

Therefore 

\Pr{Y — Lu-i(X) £ z'} - Pr{F- L,,(X) £ z'j | g 
and so we obtain 

(40) F(z) = B{z') + -A»(l + | * rV- 

which is equivalent to (2) in the theorem stated in section 1.

Thus the theorem will be proved if the assertions regarding the form of f(z) 
in (1) are shown to be true. 

For this purpose we denote, as before, the terms of the order n~’ 12 in \p(iti , its) 
and y(x, y) by 4*> , y> respectively. Since the term in which yields a w pg 
with the greatest q is 175 , we have for every w vt in y, the condition q £ 3v. 

/ y>{x, y)dydx to k — 3 — v terms, in which /,,( 0), / pa (0), 

to V — 00 

* • fpq 3 r5 (0) occur. In the integrand of /p^' 3- ’ ) ( 0), e.g., the coefficients of 
each «v„(x, 2 ) are polynomials in z and x of a total degree in z and x not exceeding 
that of (g'( 0))* -a- ’, i.e., 2(fc — 3 — v). Hence the expansion of y, will give 
rise to terms of the form 

z‘l‘ pt , q g 3r, s + t = 2(fc - 3 - v). 

Such a term will yield a term z'i (a+t \ which in turn yields the terms 4> w with 
t ^ $ ■f 3 *f t?± 3y *f 2(ik - 3 - v) 3(ic - 3), 

Equality holds only when v = h - 3 and g = 3(1; - 3) . But when f = k - 3, 

the term in question is 

/o,3(U)(0) = /o, 34-10 = $ 1SH0) 

Next, we see that contains JJ t , * ■ *, . Since ( 0 ) is a poly- 

nomial of the [h - 3 - y)th degree in i, the expansion of 7 , will yield I®,,, 
■ * ■, I« 8 f . But J^ 8 *' — 0 ifp>fc — 3 — p, hence p £ fc - 3 - v. Thus 
in fa we need only take account of the terms (Vftf with p £ It - 3 - v. 

Now if j < k - 3 - v, in U, only a 3 , • • •, a 2 ( Jw-o occur. If k - 3 - v, 

in the coefficient of a term (iff (iff with p £ h - 3 - v the greatest index 
of a is 


2(k - 3 - v) + j - (k - 3 - f) = j+ fc * 3 - v i fc - 1 

since j i v + 2. Hence in the expansion of every 7 only on , • - ■ , aw occur, 
The proof of the theorem is completed.
SOME IMPROVEMENTS IN SETTING LIMITS FOR THE EXPECTED 
NUMBER OF OBSERVATIONS REQUIRED BY A SEQUENTIAL 
PROBABILITY RATIO TEST 

By Abraham Wald 
Columbia University 

Summary. Upper and lower limits for the expected number n of observations
required by a sequential probability ratio test have been derived in a previous
publication [1]. The limits given there, however, are far apart and of little
practical value when the expected value of a single term z in the cumulative
sum computed at each stage of the sequential test is near zero. In this paper
upper and lower limits for the expected value of n are derived which will, in
general, be close to each other when the expected value of z is in the
neighborhood of zero. These limits are expressed in terms of limits for the
expected values of certain functions of the cumulative sum $Z_n$ at the
termination of the sequential test.

In section 7 a general method is given for determining limits for the expected
value of any function of $Z_n$.


1. Introduction. Let x be a random variable and let $f(x, \theta)$ be the elementary
probability law of x involving an unknown parameter $\theta$. Let $H_0$ denote the
hypothesis that $\theta = \theta_0$, and $H_1$ the hypothesis that $\theta = \theta_1$, where $\theta_0$ and $\theta_1$
are given specified values. The sequential probability ratio test for testing $H_0$
against $H_1$, as defined in [1], is given as follows: Put

(1.1) $z_i = \log \dfrac{f(x_i, \theta_1)}{f(x_i, \theta_0)}$

where $x_i$ denotes the i-th observation on x. Two constants a and b are chosen,
where a > 0 and b < 0. At each stage of the experiment, at the m-th trial for
each positive integral value m, the cumulative sum

(1.2) $Z_m = z_1 + \cdots + z_m$

is computed. Experimentation is continued as long as $b < Z_m < a$. The first
time that $Z_m$ does not lie between b and a, experimentation is terminated. The
hypothesis $H_1$ is accepted if $Z_m \ge a$, and $H_0$ is accepted if $Z_m \le b$.

Let n denote the smallest value of m for which $Z_m$ does not lie between b and a.
Then n is the number of observations required by the sequential test. The
expected value of n is a function of the true parameter value $\theta$ and is denoted
by $E_\theta(n)$.
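
The stopping rule just described can be sketched in a few lines of Python. This is an illustration only, not the author's procedure in code form: it assumes a normal density with unit variance for $f(x, \theta)$, and the names `log_likelihood_ratio` and `sprt` are hypothetical.

```python
import random

def log_likelihood_ratio(x, theta0, theta1):
    """z = log f(x, theta1) / f(x, theta0), assuming a unit-variance normal density."""
    return (theta1 - theta0) * x - 0.5 * (theta1 ** 2 - theta0 ** 2)

def sprt(draw, theta0, theta1, a, b):
    """Run one sequential test; return (accepted hypothesis, n, Z_n). Requires b < 0 < a."""
    total, n = 0.0, 0
    while b < total < a:          # continue as long as b < Z_m < a
        n += 1
        total += log_likelihood_ratio(draw(), theta0, theta1)
    return ("H1" if total >= a else "H0"), n, total

random.seed(1)
# Observations are drawn with true parameter value 1.0 in this example.
decision, n, z_n = sprt(lambda: random.gauss(1.0, 1.0), 0.0, 1.0, a=3.0, b=-3.0)
print(decision, n, z_n)
```

Averaging `n` over many repetitions gives a Monte Carlo estimate of $E_\theta(n)$ for the chosen true parameter value.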

Upper and lower limits for $E_\theta(n)$ have been derived in section 4 of [1]. These
limits, however, are of little practical value when the expected value of

(1.3) $z = \log \dfrac{f(x, \theta_1)}{f(x, \theta_0)}$

is in the neighborhood of zero, for they converge to $+\infty$ and $-\infty$, respectively,
as the expected value of z approaches zero. It can be shown that the expected
value of z is negative when $\theta = \theta_0$, and positive when $\theta = \theta_1$.¹ Thus, if the
expected value of z is a continuous function of $\theta$, there will be a value $\theta'$ between
$\theta_0$ and $\theta_1$ such that the expected value of z is zero when $\theta = \theta'$. Hence, the
limits for $E_\theta(n)$, as given in [1], are of no practical value when $\theta$ is near $\theta'$.

The purpose of this paper is to derive upper and lower limits for $E_\theta(n)$ which
will be, in general, close to each other when $\theta$ is in the neighborhood of $\theta'$. Thus,
it will generally be possible to obtain close limits for $E_\theta(n)$ over the whole range
of $\theta$, if the limits given here are used for values in a certain small interval
containing $\theta'$, and the limits given in [1] are used when $\theta$ is outside this interval.

2. Notation. We shall use the following notations throughout the paper.
For any random variable u, the symbol $E_\theta(u)$ will denote the expected value of
u when $\theta$ is the true value of the parameter. The conditional expected value of
u, under the restriction that some relationship R is fulfilled, will be denoted
by $E_\theta(u \mid R)$. The symbol $P(R \mid \theta)$ will denote the probability that the
relationship R holds when $\theta$ is true.

The cumulative distribution function of z will be denoted by $F(z, \theta)$ when $\theta$
is the true value of the parameter. The moment generating function of z,
when $\theta$ is true, will be denoted by $\varphi(t, \theta)$, i.e.

(2.1) $\varphi(t, \theta) = \int_{-\infty}^{\infty} e^{tz}\,dF(z, \theta).$

3. Assumptions concerning the family of distribution functions $F(z, \theta)$. In
this section we shall formulate two assumptions concerning $F(z, \theta)$ which will
then be used to prove various lemmas and theorems. Since we are interested
in values of $\theta$ near $\theta'$, we shall restrict the domain of $\theta$ to a finite closed interval
I containing $\theta'$ in its interior. It will be understood throughout the paper that
any statements concerning $\theta$ refer to the domain I, even if this is not explicitly
stated.

Assumption 1. The moment generating function $\varphi(t, \theta)$ exists for any point
t in the complex plane and any value $\theta$, and is a continuous function of $\theta$.

Assumption 2. There exists a positive $\delta$ such that $P(e^z > 1 + \delta \mid \theta)$ and
$P(e^z < 1 - \delta \mid \theta)$ have positive lower bounds with respect to $\theta$.


4. Proof that $\varphi(t, \theta)$ is continuous in t and $\theta$ jointly and that all moments of z
are continuous functions of $\theta$.² In this section we shall prove the following
theorem:


¹ This follows easily from Lemma 1 in [1], p. 156.

² The original proof of the author was somewhat lengthy. The present proof was
suggested by T. E. Harris.
Theorem 4.1. It follows from Assumption 1 that $\varphi(t, \theta)$ is continuous in t and
$\theta$ jointly and all moments of z are continuous functions of $\theta$.

Proof: First we show that $\varphi(t, \theta)$ is a bounded function of t and $\theta$ in the
domain $|t| \le t_0$, for any finite positive value $t_0$. Clearly,

(4.1) $0 \le |\varphi(t, \theta)| \le 2[\varphi(t_0, \theta) + \varphi(-t_0, \theta)]$

for all values t for which $|t| \le t_0$. The boundedness of $\varphi(t_0, \theta)$ and $\varphi(-t_0, \theta)$
follows from Assumption 1. Hence $\varphi(t, \theta)$ is a bounded function of $\theta$ and t
over any bounded t-domain.

Let $\{t_m, \theta_m\}$ (m = 1, 2, ..., ad inf.) be a sequence of pairs converging to
the pair $(t', \theta')$. We have

(4.2) $\varphi(t_m, \theta_m) - \varphi(t', \theta') = [\varphi(t_m, \theta_m) - \varphi(t', \theta_m)] + [\varphi(t', \theta_m) - \varphi(t', \theta')].$

The second expression in brackets converges to zero by continuity in $\theta$. Thus
the first part of Theorem 4.1 is proved if we show that

(4.3) $\lim_{m \to \infty} [\varphi(t_m, \theta_m) - \varphi(t', \theta_m)] = 0.$


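It follows from Assumption 1 that for any given $\theta$, $\varphi(t, \theta)$ is an analytic
function with no singularities in any finite t-domain. Hence we can expand
$\varphi(t_m, \theta_m)$ in a Taylor series around $t = t'$, i.e.

(4.4) $\varphi(t_m, \theta_m) - \varphi(t', \theta_m) = \sum_{k=1}^{\infty} \frac{1}{k!} \left. \frac{d^k \varphi(t, \theta_m)}{dt^k} \right|_{t = t'} (t_m - t')^k.$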

Let r be a given positive value. Because of the boundedness of $\varphi(t, \theta)$ in any
finite t-domain, there exists a constant M such that $|\varphi(t, \theta)| < M$ for all $\theta$
and for all t in the domain $|t - t'| \le r$. From the Cauchy integral formula
for an analytic function it follows that

(4.5) $\left| \frac{1}{k!} \left. \frac{d^k \varphi(t, \theta_m)}{dt^k} \right|_{t = t'} \right| \le \frac{M}{r^k}.$

From (4.4) and (4.5) we obtain

(4.6) $|\varphi(t_m, \theta_m) - \varphi(t', \theta_m)| \le M \sum_{k=1}^{\infty} \frac{|t_m - t'|^k}{r^k}.$

Equation (4.3) is an immediate consequence of (4.6). This proves the first
half of Theorem 4.1.

Let C be a circle in the complex t-plane with finite radius and center at the
origin. According to the Cauchy integral formula we have

(4.7) $\frac{1}{2\pi i} \int_C \frac{\varphi(t, \theta)}{t^{k+1}}\,dt = \frac{1}{k!} \left. \frac{d^k \varphi(t, \theta)}{dt^k} \right|_{t = 0} = \frac{1}{k!}\,E_\theta(z^k).$

Since $\varphi(t, \theta)$ is continuous in t and $\theta$ jointly, the integral on the left hand side of
(4.7) is a continuous function of $\theta$. This proves the second half of Theorem 4.1.
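
The contour-integral representation of the moments in (4.7) lends itself to a quick numerical check. The sketch below is an illustration under assumptions: z is taken to be standard normal, so that the moment generating function is $e^{t^2/2}$, and the integral over the circle is approximated by an equally spaced sum. The names are hypothetical.

```python
import cmath
import math

def moment_from_mgf(mgf, k, radius=1.0, points=64):
    """Approximate E(z^k) = k!/(2*pi*i) * integral of mgf(t)/t**(k+1) over |t| = radius."""
    total = 0.0 + 0.0j
    for j in range(points):
        t = radius * cmath.exp(2j * math.pi * j / points)
        # With t = radius*exp(i*angle), dt = i*t*d(angle), which cancels one power of t.
        total += mgf(t) / t ** k
    return (math.factorial(k) * total / points).real

def standard_normal_mgf(t):
    """Moment generating function of a standard normal variable (an assumed example)."""
    return cmath.exp(0.5 * t * t)

for k in range(1, 5):
    print(k, moment_from_mgf(standard_normal_mgf, k))
```

For the standard normal case the second and fourth moments come out as 1 and 3, and the odd moments vanish, in agreement with the known values.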
5. Some lemmas. In this section we shall prove several lemmas which will
then be used to derive the results contained in sections 6 and 8.

Lemma 5.1. It follows from assumptions 1 and 2 that for any given $\theta$ the
equation in t

(5.1) $\varphi(t, \theta) = 1$

has exactly two real roots, one of which is zero. The other real root is different from
zero if $E_\theta(z) \ne 0$. If $E_\theta(z) = 0$, both roots are equal to zero, i.e., zero is a double
root of (5.1).

This lemma is essentially the same as Lemma 2 in [2] and the proof is therefore
omitted.³

Let $h(\theta)$ denote the non-zero root of (5.1), if $E_\theta(z) \ne 0$. If $E_\theta(z) = 0$, we
put $h(\theta) = 0$.
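
The root $h(\theta)$ rarely has a closed form, but it is easy to locate numerically. The sketch below is illustrative and assumes that z is normal with mean `mu` and standard deviation `sigma`, in which case the moment generating function is $\exp(\mu t + \sigma^2 t^2/2)$ and the non-zero root is known to be $-2\mu/\sigma^2$. The function names are hypothetical.

```python
import math

def mgf_normal(t, mu, sigma):
    """Moment generating function of a normal variable (assumed example distribution)."""
    return math.exp(mu * t + 0.5 * (sigma * t) ** 2)

def nonzero_root(mu, sigma, iterations=200):
    """Bisection for the non-zero real root of mgf(t) = 1; returns 0.0 when mu == 0."""
    if mu == 0.0:
        return 0.0
    sign = -1.0 if mu > 0 else 1.0      # the root has the sign opposite to E(z)
    lo = sign * abs(mu) / sigma ** 2    # minimiser of the mgf, where its value is below 1
    hi = sign * 1.0
    while mgf_normal(hi, mu, sigma) <= 1.0:
        hi *= 2.0                       # move outward until the mgf exceeds 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mgf_normal(mid, mu, sigma) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(nonzero_root(0.3, 1.0))   # close to -2 * 0.3 / 1.0**2
```

In this normal example the ratio of the root to the mean equals $-2/\sigma^2$ for every non-zero mean, which is the kind of behaviour near a zero mean that the later lemmas of this section establish in general.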

In what follows the variable t will be restricted to real values, unless the 
contrary is explicitly stated. 

Lemma 5.2. It follows from assumptions 1 and 2 that $h(\theta)$ is a continuous
function of $\theta$.

Proof: It follows from assumption 2 that

(5.2) $\lim_{t \to \pm\infty} \varphi(t, \theta) = +\infty$

uniformly in $\theta$. Hence, since by definition

$$\varphi[h(\theta), \theta] = 1$$

identically in $\theta$, $h(\theta)$ must be a bounded function of $\theta$.

Let $\{\theta_m\}$ be a sequence of parameter values which converges to $\theta^*$. From
Theorem 4.1 it follows that

(5.3) $\lim_{m \to \infty} [\varphi(t, \theta_m) - \varphi(t, \theta^*)] = 0$

uniformly in t over any finite interval. Since $h(\theta)$ is bounded, we obtain from
(5.3)

(5.4) $\lim_{m \to \infty} \{\varphi[h(\theta_m), \theta_m] - \varphi[h(\theta_m), \theta^*]\} = 0.$

Since $\varphi[h(\theta_m), \theta_m] = 1$, it follows from (5.4) that

$$\lim_{m \to \infty} \varphi[h(\theta_m), \theta^*] = 1.$$

It follows from assumption 1 that for any limit point $\bar h$ of the bounded
sequence $\{h(\theta_m)\}$ (m = 1, 2, ..., ad inf.) we have

(5.5) $\varphi(\bar h, \theta^*) = 1.$


³ Condition IV of Lemma 2 in [2] is not postulated here, since the validity of this
condition is implied by assumption 1. Condition IV could have been omitted also in [2],
since it follows from condition III.


If $h(\theta^*) = 0$, then the equation $\varphi(t, \theta^*) = 1$ has the only root t = 0.
Consequently, all limit points of $\{h(\theta_m)\}$ must be equal to zero, that is

(5.6) $\lim_{m \to \infty} h(\theta_m) = 0$ if $h(\theta^*) = 0$.

Now let us assume that $h(\theta^*) \ne 0$. Since the second derivative of $\varphi(t, \theta)$
with respect to t is positive, it can be seen that $\varphi(t, \theta) < 1$ for values t in the open
interval $(0, h(\theta))$, and $\varphi(t, \theta) > 1$ for any t outside the closed interval $[0, h(\theta)]$.
Hence, $\varphi(t, \theta) < 1$ implies that $|h(\theta)| > |t|$ and that $h(\theta)$ and t have the same
sign. Now let $t_0$ be a value in the open interval $(0, h(\theta^*))$. Then we have

(5.7) $\varphi(t_0, \theta^*) < 1.$

It follows from assumption 1 that

(5.8) $\varphi(t_0, \theta_m) < 1$

for sufficiently large m. Hence $h(\theta_m)$ and $t_0$ have the same sign and

(5.9) $|h(\theta_m)| > |t_0|.$

Inequality (5.9) implies that zero cannot be a limit point of the sequence
$\{h(\theta_m)\}$. Since $\varphi(t, \theta^*) = 1$ has only the roots t = 0 and $t = h(\theta^*)$, it follows
from (5.5) that the sequence $\{h(\theta_m)\}$ cannot have a limit point different from
$h(\theta^*)$. Thus,

(5.10) $\lim_{m \to \infty} h(\theta_m) = h(\theta^*)$

and Lemma 5.2 is proved.

Lemma 5.3. It follows from assumption 1 that for any given t, $E_\theta(e^{|tz|})$ is
a bounded function of $\theta$.

Proof: We have

(5.11) $E_\theta(e^{|tz|}) \le E_\theta(e^{tz} + e^{-tz}) = \varphi(t, \theta) + \varphi(-t, \theta).$

It follows from assumption 1 that $\varphi(t, \theta)$ and $\varphi(-t, \theta)$ are bounded functions
of $\theta$. Hence Lemma 5.3 is proved.

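Lemma 5.4. Let $\theta'$ be a value of $\theta$ such that $E_{\theta'}(z) = 0$, but $E_\theta(z) \ne 0$ for all
$\theta \ne \theta'$ in an open interval containing $\theta'$. It follows from assumptions 1 and 2
that

(5.12) $\lim_{\theta \to \theta'} \frac{h(\theta)}{E_\theta(z)} = -\frac{2}{E_{\theta'}(z^2)}.$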

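Proof: We have

(5.13) $e^{h(\theta) z} = 1 + h(\theta) z + \frac{h(\theta)^2}{2} z^2 + \frac{h(\theta)^3}{6} z^3 e^{u h(\theta) z}$

where $0 \le u \le 1$. Hence

(5.14) $E_\theta(e^{h(\theta) z}) = 1 + h(\theta) E_\theta(z) + \frac{h(\theta)^2}{2} E_\theta(z^2) + \frac{h(\theta)^3}{6} E_\theta(z^3 e^{u h(\theta) z}).$

Since $E_\theta(e^{h(\theta) z}) = 1$, we obtain from (5.14)

(5.15) $h(\theta) E_\theta(z) + \frac{h(\theta)^2}{2} E_\theta(z^2) + \frac{h(\theta)^3}{6} E_\theta(z^3 e^{u h(\theta) z}) = 0.$

We shall consider only values $\theta$ for which $h(\theta) \ne 0$. For such values of $\theta$,
also $E_\theta(z) \ne 0$. Dividing (5.15) by $h(\theta) E_\theta(z)$, we obtain

(5.16) $1 + \frac{h(\theta)}{2 E_\theta(z)} \left[ E_\theta(z^2) + \frac{h(\theta)}{3} E_\theta(z^3 e^{u h(\theta) z}) \right] = 0.$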

Let $h_0$ be an upper bound of $|h(\theta)|$ with respect to $\theta$. Then for a suitably
chosen constant C we have

(5.17) $|z^3 e^{u h(\theta) z}| < C e^{2 h_0 |z|}.$

From this and Lemma 5.3 it follows that $E_\theta(z^3 e^{u h(\theta) z})$ is a bounded function
of $\theta$.

Because of the continuity of $h(\theta)$ we have

(5.18) $\lim_{\theta \to \theta'} h(\theta) = 0.$

Lemma 5.4 follows from (5.16), (5.18), the boundedness of $E_\theta(z^3 e^{u h(\theta) z})$ and
the fact that $E_\theta(z^2)$ is a continuous function of $\theta$ and $E_{\theta'}(z^2) > 0$.

Lemma 5.5. From assumptions 1 and 2 it follows that for any given t, $E_\theta(e^{|t Z_n|})$
exists and is a bounded function of $\theta$.

Proof: It is sufficient to show that $E_\theta(e^{t Z_n})$ is a bounded function of $\theta$ for
any t, since

(5.19) $e^{|t Z_n|} \le e^{t Z_n} + e^{-t Z_n}.$

Clearly, $e^{t Z_n}$ lies between $e^{tb + t z_n}$ and $e^{ta + t z_n}$. Hence Lemma 5.5 is proved if
we show that $E_\theta(e^{t z_n})$ is a bounded function of $\theta$.

It follows from Assumption 2 that there exists a positive integer k and a
positive constant g such that

(5.20) $P(|z_1 + \cdots + z_k| \ge a - b \mid \theta) \ge g$

for all $\theta$. For any positive integer m and for any real values $\lambda_1 < \lambda_2$ we have

(5.21) $\frac{P[(m-1)k < n \le mk \mid \theta]}{P[(m-1)k < n \mid \theta]} \ge g$

and

(5.22) $\frac{P[(m-1)k < n \le mk \;\&\; \lambda_1 \le z_n < \lambda_2 \mid \theta]}{P[(m-1)k < n \mid \theta]} \le 1 - [1 - P(\lambda_1 \le z < \lambda_2 \mid \theta)]^k$

(m = 1, 2, ..., ad inf.).
Hence

(5.23) $\frac{P[(m-1)k < n \le mk \;\&\; \lambda_1 \le z_n < \lambda_2 \mid \theta]}{P[(m-1)k < n \le mk \mid \theta]} \le \frac{1 - [1 - P(\lambda_1 \le z < \lambda_2 \mid \theta)]^k}{g}.$

Multiplying (5.23) by $P[(m-1)k < n \le mk \mid \theta]$ and summing with respect
to m we obtain

(5.24) $P(\lambda_1 \le z_n < \lambda_2 \mid \theta) \le \frac{1 - [1 - P(\lambda_1 \le z < \lambda_2 \mid \theta)]^k}{g}.$

From (5.24) it follows readily that

(5.25) $\frac{P(\lambda_1 \le z_n < \lambda_2 \mid \theta)}{P(\lambda_1 \le z < \lambda_2 \mid \theta)}$

is a bounded function of $\lambda_1$, $\lambda_2$ and $\theta$. Let A be an upper bound of the ratio
(5.25). Then

(5.26) $E_\theta(e^{t z_n}) \le A\,E_\theta(e^{tz}) = A\,\varphi(t, \theta).$

Because of Assumption 1, $\varphi(t, \theta)$ is a bounded function of $\theta$. Hence also
$E_\theta(e^{t z_n})$ is bounded and Lemma 5.5 is proved.


6. The limiting value of $E_\theta(n)$ when $\theta$ approaches a value $\theta'$ for which
$E_{\theta'}(z) = 0$. In this section we shall prove the following theorem:

Theorem 6.1. Let $\theta'$ be a value of $\theta$ such that $E_{\theta'}(z) = 0$, but $E_\theta(z) \ne 0$ for
all $\theta \ne \theta'$ in an open interval containing $\theta'$. If assumptions 1 and 2 hold, we have

(6.1) $\lim_{\theta \to \theta'} \left[ E_\theta(n) - \frac{E_\theta(Z_n^2)}{E_\theta(z^2)} \right] = 0.$

Proof: Consider the Taylor expansion

(6.2) $e^{h(\theta) Z_n} = 1 + h(\theta) Z_n + \frac{h(\theta)^2}{2} Z_n^2 + \frac{h(\theta)^3}{6} Z_n^3 e^{u h(\theta) Z_n}$

where $0 \le u \le 1$. It was shown in [2] (p. 286) that

(6.3) $E_\theta(e^{h(\theta) Z_n}) = 1.$

Hence, taking expected values on both sides of (6.2), we obtain

(6.4) $h(\theta) E_\theta(Z_n) + \frac{h(\theta)^2}{2} E_\theta(Z_n^2) + \frac{h(\theta)^3}{6} E_\theta(Z_n^3 e^{u h(\theta) Z_n}) = 0.$

We consider only values of $\theta$ for which $E_\theta(z) \ne 0$. For such values, also
$h(\theta) \ne 0$. Thus, we can divide both sides of (6.4) by $h(\theta) E_\theta(z)$. We then
obtain




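(6.5) $\frac{E_\theta(Z_n)}{E_\theta(z)} + \frac{h(\theta)}{2 E_\theta(z)} \left[ E_\theta(Z_n^2) + \frac{h(\theta)}{3} E_\theta(Z_n^3 e^{u h(\theta) Z_n}) \right] = 0.$

It was shown in [1] (p. 142) that

(6.6) $E_\theta(n) = \frac{E_\theta(Z_n)}{E_\theta(z)}.$

Hence

(6.7) $E_\theta(n) + \frac{h(\theta)}{2 E_\theta(z)} \left[ E_\theta(Z_n^2) + \frac{h(\theta)}{3} E_\theta(Z_n^3 e^{u h(\theta) Z_n}) \right] = 0.$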

Let $t_0$ be an upper bound of $|h(\theta)|$. Then for a properly chosen constant $C$ we have


(6.8) \Z*J mu \ <> C'e 1 ' 02 ’ 1 ' 

From this and Lemma 5.5 it follows that $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$ is a bounded function of $\theta$. Since $\lim_{\theta \to \theta'} h(\theta) = 0$ and $E_\theta(Z_n^2)$ has a positive lower bound, Theorem 6.1 follows from (6.7), Lemma 5.4 and Theorem 4.1.

If $\lim_{\theta \to \theta'} E_\theta Z_n^2 = E_{\theta'} Z_n^2$, Theorem 6.1 gives$^4$

$$E_{\theta'}(n) = \frac{E_{\theta'}(Z_n^2)}{E_{\theta'}(z^2)}. \tag{6.9}$$

Limits for $E_{\theta'}(n)$ can be obtained by computing limits for $E_{\theta'}(Z_n^2)$. In the next section we shall give a general method for obtaining limits for $E_\theta[\psi(Z_n)]$, where $\psi(Z_n)$ is any function of $Z_n$.


7. Determination of lower and upper limits for the expected value of any function of $Z_n$. Let $\psi(Z_n)$ be a function of $Z_n$. Limits for $E_\theta[\psi(Z_n)]$ may be determined as follows: First we determine limits for $E_\theta[\psi(Z_n) \mid Z_n \ge a]$. Let $r$ be a positive variable. Clearly, for any given value $r$ we have

$$E_\theta[\psi(Z_n) \mid Z_{n-1} = a - r \text{ and } Z_n \ge a] = E_\theta[\psi(a - r + z) \mid z \ge r]. \tag{7.1}$$

From (7.1) we obtain the limits

$$\underset{0<r<a-b}{\mathrm{g.l.b.}}\ E_\theta[\psi(a - r + z) \mid z \ge r] \le E_\theta[\psi(Z_n) \mid Z_n \ge a] \le \underset{0<r<a-b}{\mathrm{l.u.b.}}\ E_\theta[\psi(a - r + z) \mid z \ge r]. \tag{7.2}$$

Limits for $E_\theta[\psi(Z_n) \mid Z_n \le b]$ can be obtained in a similar way. Again, let $r$ be a positive variable. For any value of $r$ we have

$$E_\theta[\psi(Z_n) \mid Z_n \le b \text{ and } Z_{n-1} = b + r] = E_\theta[\psi(b + r + z) \mid z \le -r]. \tag{7.3}$$

Hence we obtain the limits


$^4$ The validity of (6.9) was shown by the author [3] using an entirely different method.
$$\underset{0<r<a-b}{\mathrm{g.l.b.}}\ E_\theta[\psi(b + r + z) \mid z \le -r] \le E_\theta[\psi(Z_n) \mid Z_n \le b] \le \underset{0<r<a-b}{\mathrm{l.u.b.}}\ E_\theta[\psi(b + r + z) \mid z \le -r]. \tag{7.4}$$

Since

$$E_\theta[\psi(Z_n)] = P(Z_n \ge a)\, E_\theta[\psi(Z_n) \mid Z_n \ge a] + P(Z_n \le b)\, E_\theta[\psi(Z_n) \mid Z_n \le b], \tag{7.5}$$

a lower (upper) limit for $E_\theta[\psi(Z_n)]$ can be obtained by replacing the conditional expected values on the right hand side of (7.5) by their lower (upper) limits given in (7.2) and (7.4).
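The bounds for the upper-boundary case can be explored numerically. The following is only a rough sketch under an assumption not made in the text: the single increment z is taken to be normal with mean `mu` and standard deviation `sigma`. The function names, the trapezoid rule and the grid over r are illustrative choices, not part of the method itself.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def cond_expectation_upper(psi, a, r, mu, sigma, steps=2000, width=12.0):
    """Approximate E[psi(a - r + z) | z >= r] for z ~ N(mu, sigma^2)
    by the trapezoid rule on a truncated range (assumed model)."""
    hi = max(r, mu) + width * sigma
    h = (hi - r) / steps
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        z = r + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        p = normal_pdf(z, mu, sigma) * weight
        num += psi(a - r + z) * p
        den += p
    return num / den

def limits_given_upper_exit(psi, a, b, mu, sigma, grid=100):
    """Smallest and largest conditional expectation over a grid of r in (0, a - b).
    These approximate the g.l.b. and l.u.b. bounding E[psi(Z_n) | Z_n >= a]."""
    rs = [(a - b) * (j + 0.5) / grid for j in range(grid)]
    values = [cond_expectation_upper(psi, a, r, mu, sigma) for r in rs]
    return min(values), max(values)

if __name__ == "__main__":
    lo, hi = limits_given_upper_exit(lambda x: x * x, a=2.0, b=-2.0, mu=0.1, sigma=1.0)
    print(lo, hi)
```

The lower-boundary case follows by replacing the condition z >= r with z <= -r and the argument a - r + z with b + r + z.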

8. Limits for $E_\theta(n)$ when $h(\theta)$ is near but unequal to zero. Let $\theta'$ be a value of $\theta$ for which $h(\theta') = 0$. In this section we shall derive limits for $E_\theta(n)$ which will generally be close to each other for values $\theta$ in a small neighborhood of $\theta'$. From equation (6.7) we obtain

$$E_\theta(n) = -\frac{h(\theta)}{2 E_\theta(z)} \left[ E_\theta Z_n^2 + \frac{h(\theta)}{3} E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n}) \right] \tag{8.1}$$

where $0 \le \lambda \le 1$. Thus, limits for $E_\theta(n)$ can be obtained by deriving limits for $E_\theta Z_n^2$ and $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$. Limits for $E_\theta Z_n^2$ can be obtained by using the method described in section 7.

If $\theta$ is near $\theta'$, any crude limits for $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$ will serve the purpose, since, as has been shown in section 6, $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$ is bounded and $\lim_{\theta \to \theta'} h(\theta) = 0$.

Limits for $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$ can be obtained as follows: For simplicity, let us assume that $h(\theta) > 0$. Then

$$Z_n^3 \le Z_n^3 e^{\lambda h(\theta) Z_n} \le Z_n^3 e^{h(\theta) Z_n} \qquad (h(\theta) > 0). \tag{8.2}$$

Thus, to determine limits for $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$, it is sufficient to determine a lower limit for $E_\theta(Z_n^3)$ and an upper limit for $E_\theta(Z_n^3 e^{h(\theta) Z_n})$. The latter limits may be derived by using the method given in section 7.

If $h(\theta) < 0$, we have

$$Z_n^3 \ge Z_n^3 e^{\lambda h(\theta) Z_n} \ge Z_n^3 e^{h(\theta) Z_n} \tag{8.3}$$

and a similar procedure will yield the desired limits for $E_\theta(Z_n^3 e^{\lambda h(\theta) Z_n})$.

It should be emphasized that the limits of $E_\theta(n)$, as given in this section, can be expected to be close only if $h(\theta)$ is near zero. For values of $\theta$ for which $h(\theta)$ is not near zero, the limits of $E_\theta(n)$ given in [1] can be used.
THE EFFICIENCY OF THE MEAN MOVING RANGE

By Paul G. Hoel
University of California at Los Angeles

Summary. In studying the variation of a variable subject to erratic trend effects, it is customary to employ as a measure of variation a statistic that eliminates most of such effects. It is shown in this paper that the statistic $w = \sum_{i=1}^{n-1} |x_{i+1} - x_i| \sqrt{\pi} / 2(n-1)$ is nearly as efficient as the statistic $\delta^2 = \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2 / (n-1)$ that is customarily employed. The asymptotic variance of $w$ is obtained by integration techniques; the proof of the asymptotic normality of $w$ is based upon a theorem of S. Bernstein on the asymptotic distribution of sums of dependent variables. The method of proof is sufficiently general to prove the asymptotic normality of $w$, and of $\delta^2$, for $x$ having a distribution for which the third absolute moment exists.

1. Introduction. Let $x_1, x_2, \ldots, x_n$ denote a random sample of size $n$ from a population with a continuous distribution function $f(x)$. If a measure of the variability of $x$ is desired, it is customary to select the familiar statistic

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}, \tag{1}$$

or its positive square root $s$, as an estimate of the corresponding theoretical measure of variability.

If, however, it is known that the variable x is subject to trend effects and that 
f(x) represents the distribution of x without such effects, then s 2 will not serve 
as a satisfactory measure of variability about the trend. In order to eliminate 
the influence of trends, it is helpful to employ statistics that capitalize on the
time order relationships of the observations. There are several statistics of 
this type available, although most of them make no pretense of completely 
eliminating trend effects, even if the trend is linear. 

Perhaps the best known among statistics of the desired type is the mean square successive difference,

$$\delta^2 = \frac{\sum_{i=1}^{n-1} (x_{i+1} - x_i)^2}{n - 1}. \tag{2}$$

This measure of variation has been studied extensively in recent years. Among the results of these investigations is a determination [1] of the efficiency of $\delta^2/2$ as an estimate of $\sigma^2$ for a normally distributed variable when no trend exists.

A closely related measure of variation that is not so well known is the mean moving range of successive pairs of observations,

$$w = \frac{\sum_{i=1}^{n-1} |x_{i+1} - x_i|}{n - 1}. \tag{3}$$

Although $w$ appears [1] to have been used by ballisticians, very little seems to be known concerning the relative merits of $\delta^2$ and $w$. Since $w$ is considerably easier to calculate than $\delta^2$, it would be preferred to $\delta^2$ for applications in which computational advantages are important. However, one would hardly allow such advantages to dominate a choice unless $\delta^2$ and $w$ were about equally efficient as estimates of variation.

The purpose of this paper is to determine the efficiency of w and to study 
efficiency properties of generalizations of w. 
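For concreteness, the two trend-resistant statistics discussed above can be computed as follows. This is a minimal sketch; the function names are illustrative, and the scaling of w by sqrt(pi)/2 (as in the Summary) and of delta^2 by 1/2 to obtain estimates of sigma assumes a normally distributed variable with no trend.

```python
import math

def mean_moving_range(x):
    """w: average absolute difference of successive observations."""
    n = len(x)
    return sum(abs(x[i + 1] - x[i]) for i in range(n - 1)) / (n - 1)

def mean_square_successive_difference(x):
    """delta^2: average squared difference of successive observations."""
    n = len(x)
    return sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1)) / (n - 1)

def sigma_estimates(x):
    """Two estimates of sigma under the normal, no-trend assumption:
    w * sqrt(pi) / 2 and sqrt(delta^2 / 2)."""
    w = mean_moving_range(x)
    d2 = mean_square_successive_difference(x)
    return w * math.sqrt(math.pi) / 2, math.sqrt(d2 / 2)

if __name__ == "__main__":
    sample = [1.0, 3.0, 2.0, 5.0]
    print(mean_moving_range(sample), mean_square_successive_difference(sample))
    print(sigma_estimates(sample))
```

Only absolute values and one division are needed for w, which is the computational advantage the text refers to.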

2. Definition of efficiency. The definition that will be used in this paper [2] may be stated in the following manner. Let $\theta$ be a parameter, or a function of parameters, of the distribution function $f(x)$. Let $T$ be a statistic for which there exists a number $\mu$ such that

$$t = \sqrt{n}\,(T - \theta)$$

is asymptotically normally distributed with zero mean and variance $\mu^2$. Let $T'$ be any other statistic for which there exists a number $\mu'$ such that

$$t' = \sqrt{n}\,(T' - \theta)$$

is asymptotically normally distributed with zero mean and variance $\mu'^2$. Then $T$ is said to be an efficient estimate of $\theta$ provided that $\mu \le \mu'$ for all possible choices of $T'$, and the efficiency of any particular $T'$ is defined to be

$$E = \frac{\mu^2}{\mu'^2}. \tag{4}$$

In order to determine the efficiency of a statistic, it is therefore necessary to first demonstrate its asymptotic normal distribution and then calculate its asymptotic variance. This order of procedure will be reversed in the following determination of the efficiency of $w$.

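3. Variance of w. Let $x$ be normally distributed with zero mean and unit variance. Then the mean of $w$, where $w$ is given by (3), may be evaluated as follows:

$$\begin{aligned}
E(w) &= E|x_2 - x_1| \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |x_2 - x_1|\, e^{-(x_1^2 + x_2^2)/2}\, dx_1\, dx_2 \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-x_2^2/2} \left[ \int_{-\infty}^{x_2} (x_2 - x_1) e^{-x_1^2/2}\, dx_1 + \int_{x_2}^{\infty} (x_1 - x_2) e^{-x_1^2/2}\, dx_1 \right] dx_2 \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-x_2^2/2} \left[ 2 x_2 \int_0^{x_2} e^{-x_1^2/2}\, dx_1 + 2 e^{-x_2^2/2} \right] dx_2 \\
&= \frac{1}{\pi} \int_{-\infty}^{\infty} \left[ x_2 e^{-x_2^2/2} \int_0^{x_2} e^{-x_1^2/2}\, dx_1 + e^{-x_2^2} \right] dx_2 .
\end{aligned}$$

If integration by parts is performed on the first integral with

$$u = \int_0^{x_2} e^{-x_1^2/2}\, dx_1 \quad \text{and} \quad dv = x_2 e^{-x_2^2/2}\, dx_2,$$

the $uv$ term will vanish at both limits and $E(w)$ will reduce to

$$E(w) = \frac{2}{\pi} \int_{-\infty}^{\infty} e^{-x_2^2}\, dx_2 = \frac{2}{\sqrt{\pi}}. \tag{5}$$

This result could have been obtained more easily by other methods, but some of the integrals involved will be needed later.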

For the purpose of computing the second moment of $w$, it is convenient to separate the independent and dependent product terms of $w^2$. Since there are $2(n - 2)$ of the latter, $E(w^2)$ may be expressed in the form

$$(n-1)^2 E(w^2) = (n-1) E|x_2 - x_1|^2 + 2(n-2) E|x_2 - x_1||x_3 - x_2| + (n-2)(n-3) E^2|x_2 - x_1|.$$

But

$$E|x_2 - x_1|^2 = E(x_2 - x_1)^2 = E(x_2^2) + E(x_1^2) = 2.$$

Consequently, because of (5),

$$(n-1)^2 E(w^2) = 2(n-2) E|x_2 - x_1||x_3 - x_2| + 2(n-1) + 4(n-2)(n-3)/\pi. \tag{6}$$

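Now consider the evaluation of the product term

$$E|x_2 - x_1||x_3 - x_2| = (2\pi)^{-3/2} \iiint |x_2 - x_1||x_3 - x_2|\, e^{-(x_1^2 + x_2^2 + x_3^2)/2}\, dx_1\, dx_2\, dx_3 .$$

By means of the expressions that were used to give (5), this triple integral may be reduced in the following manner:

$$\begin{aligned}
E|x_2 - x_1||x_3 - x_2| &= (2\pi)^{-3/2} \iint |x_3 - x_2|\, e^{-(x_2^2 + x_3^2)/2} \cdot 2\left[ x_2 \int_0^{x_2} e^{-x_1^2/2}\, dx_1 + e^{-x_2^2/2} \right] dx_2\, dx_3 \\
&= (2\pi)^{-3/2} \int e^{-x_2^2/2} \cdot 4\left[ x_2 \int_0^{x_2} e^{-x_1^2/2}\, dx_1 + e^{-x_2^2/2} \right]^2 dx_2 \\
&= 4(2\pi)^{-3/2} \int \left[ x_2^2 e^{-x_2^2/2} \left( \int_0^{x_2} e^{-x_1^2/2}\, dx_1 \right)^2 + 2 x_2 e^{-x_2^2} \int_0^{x_2} e^{-x_1^2/2}\, dx_1 + e^{-3x_2^2/2} \right] dx_2 .
\end{aligned}$$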
These three integrals, without their constant factors, will be denoted by $I_1$, $I_2$, and $I_3$, respectively. $I_1$ may be evaluated by integrating by parts with

$$u = x_2 \left( \int_0^{x_2} e^{-x_1^2/2}\, dx_1 \right)^2 \quad \text{and} \quad dv = x_2 e^{-x_2^2/2}\, dx_2 .$$

The $uv$ term will vanish at both limits; consequently

$$\begin{aligned}
I_1 &= \int e^{-x_2^2/2} \left[ 2 x_2 e^{-x_2^2/2} \int_0^{x_2} e^{-x_1^2/2}\, dx_1 + \left( \int_0^{x_2} e^{-x_1^2/2}\, dx_1 \right)^2 \right] dx_2 \\
&= 2 \int x_2 e^{-x_2^2} \int_0^{x_2} e^{-x_1^2/2}\, dx_1\, dx_2 + \int e^{-x_2^2/2} \left( \int_0^{x_2} e^{-x_1^2/2}\, dx_1 \right)^2 dx_2 .
\end{aligned} \tag{7}$$

The first of these two integrals may be evaluated in the same manner as the first integral preceding (5). The second integral may be evaluated by making the change of variable

$$v = \int_0^{x_2} e^{-x_1^2/2}\, dx_1 .$$

As a result of such manipulations,

$$I_1 = \sqrt{\frac{2\pi}{3}} + \frac{\pi\sqrt{2\pi}}{6} .$$

It will be observed that $I_2$ is the same as the first integral of (7) and that $I_3$ is available in tables; hence


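$$E|x_2 - x_1||x_3 - x_2| = 4(2\pi)^{-3/2} \left[ \sqrt{\frac{2\pi}{3}} + \frac{\pi\sqrt{2\pi}}{6} + \sqrt{\frac{2\pi}{3}} + \sqrt{\frac{2\pi}{3}} \right] = \frac{2\sqrt{3}}{\pi} + \frac{1}{3}. \tag{8}$$

If (8) is substituted in (6), $E(w^2)$ will reduce to

$$E(w^2) = \frac{2(n-2)}{(n-1)^2} \left[ \frac{1}{3} + \frac{2\sqrt{3}}{\pi} \right] + \frac{2}{n-1} + \frac{4(n-2)(n-3)}{\pi (n-1)^2}. \tag{9}$$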

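Since $\sigma_w^2 = E(w^2) - E^2(w)$, (9) and (5) will yield the following desired variance:

$$\sigma_w^2 = \frac{2}{(n-1)^2} \left[ \left( \frac{4}{3} + \frac{2\sqrt{3} - 6}{\pi} \right) n + \left( \frac{10 - 4\sqrt{3}}{\pi} - \frac{5}{3} \right) \right]. \tag{10}$$

4. Efficiency of w. Let $x$ be normally distributed with mean $m$ and variance $\sigma^2$. Then the mean of $w$ as given by (5) will be multiplied by $\sigma$ and the variance of $w$ as given by (10) will be multiplied by $\sigma^2$; consequently $z = w\sqrt{\pi}/2$ will be an unbiased estimate of $\sigma$. In section 5 it will be shown that

$$t' = \sqrt{n}\,(z - \sigma)$$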
possesses an asymptotic normal distribution. From (10) and section 2, it therefore follows that the asymptotic variance, $\mu'^2$, that is needed to determine the efficiency of $z$ is given by

$$\mu'^2 = \frac{\pi}{2} \left( \frac{4}{3} + \frac{2\sqrt{3} - 6}{\pi} \right) = \frac{2\pi}{3} + \sqrt{3} - 3 .$$

Now it is known that for $x$ normally distributed $s$, as defined by (1), is an efficient estimate of $\sigma$ with $\mu^2 = \tfrac{1}{2}$; consequently, because of (4), the efficiency of $z$ as an estimate of $\sigma$ is given by

$$E = \frac{1}{2\left( \frac{2\pi}{3} + \sqrt{3} - 3 \right)} = .605. \tag{11}$$

In [1] it was shown that for $x$ normally distributed $\delta^2/2$ was an unbiased estimate of $\sigma^2$ and, assuming the normality of its asymptotic distribution, that the efficiency of $\delta^2/2$ as an estimate of $\sigma^2$ was 2/3. Thus, $z = w\sqrt{\pi}/2$ possesses very nearly the same efficiency as a measure of variation of a normal variable as $\delta^2/2$ does.
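The arithmetic behind the variance of w and the efficiency figure .605 can be checked with a few lines of code. This is a sketch under stated assumptions: the closed form in `var_w` is obtained by expanding E(w^2) - E^2(w) with E|x_2 - x_1| = 2/sqrt(pi), E|x_2 - x_1|^2 = 2 and the product moment 2*sqrt(3)/pi + 1/3 for unit normal observations; the function names are illustrative.

```python
import math

def var_w(n):
    """Variance of the mean moving range w for n unit-normal observations,
    from the expansion of E(w^2) - E(w)^2 described in the lead-in."""
    lead = 4.0 / 3.0 + (2.0 * math.sqrt(3.0) - 6.0) / math.pi
    const = (10.0 - 4.0 * math.sqrt(3.0)) / math.pi - 5.0 / 3.0
    return 2.0 * (lead * n + const) / (n - 1) ** 2

def asymptotic_variance_z():
    """Limit of n * Var(z) for z = w * sqrt(pi) / 2."""
    return 2.0 * math.pi / 3.0 + math.sqrt(3.0) - 3.0

def efficiency_z():
    """Efficiency of z relative to s, whose asymptotic variance is 1/2."""
    return 0.5 / asymptotic_variance_z()

if __name__ == "__main__":
    print(var_w(2), var_w(50))
    print(asymptotic_variance_z(), efficiency_z())
```

For n = 2 the statistic w is the single absolute difference |x_2 - x_1|, whose variance is 2 - 4/pi; this gives a quick consistency check on the constant term.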


5. Asymptotic distribution of mean moving ranges. Although the efficiency obtained in the preceding section requires for its validity merely a demonstration that for $x$ normally distributed $w$ possesses an asymptotic normal distribution, it will be shown in this section that general mean moving ranges of a continuous variable $x$ possess asymptotic normal distributions provided only that $x$ possesses a third absolute moment.

Let $r_i$ denote the range of the observations from $x_i$ to $x_{i+k-1}$. Then the variable

$$W = \frac{r_1 + r_2 + \cdots + r_{n-k+1}}{n - k + 1} \tag{12}$$

will represent a generalized mean moving range, of which $w$ will be a special case when $k = 2$.
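The generalized statistic is easy to compute directly from its definition. The following is a minimal sketch; the function name is illustrative and not taken from the paper.

```python
def generalized_mean_moving_range(x, k):
    """W: mean of the ranges of all n - k + 1 windows of k consecutive observations."""
    n = len(x)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= len(x)")
    ranges = [max(x[i:i + k]) - min(x[i:i + k]) for i in range(n - k + 1)]
    return sum(ranges) / len(ranges)

if __name__ == "__main__":
    sample = [1.0, 3.0, 2.0, 5.0]
    print(generalized_mean_moving_range(sample, 2))  # reduces to w
    print(generalized_mean_moving_range(sample, 3))
```

With k = 2 each window range is |x_{i+1} - x_i|, so the function reduces to the mean moving range w.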

A proof of the asymptotic property of $W$ can be constructed as an application of a general theorem of S. Bernstein [3]. Since his theorem is long and involves much explanation of notation, a simplified version of it that is sufficient to cover this application, and indeed many similar applications, will be given.

Let $y_1, y_2, \ldots, y_m$ denote $m$ variables for which the third absolute moments are bounded and let

$$S_m = y_1 + y_2 + \cdots + y_m .$$

Then Bernstein's theorem implies that if there exist constants $c_1$, $c_2$, $c_3$, and $c_4$ such that

(a) $c_1 m < \sigma_{S_m}^2 < c_2 m$,

and

(b) $y_i$ and $y_{i+g}$ are independently distributed for $g > c_3 m^{c_4}$, $c_4 <$ …,

then

$$\frac{S_m - E(S_m)}{\sigma_{S_m}}$$

possesses an asymptotic normal distribution with zero mean and unit variance.

Consider the application of this theorem to $R = (n - k + 1)W$. The variance of $R$ may be expressed in compact form by means of the techniques of section 3. Since $r_i$ is the range of $k$ consecutive observations, it is clear that

$$E(r_i r_{i+g}) = E^2(r_1)$$

if $g \ge k$. Furthermore, for subscripts for which it is defined, $E(r_i r_{i+g})$ will be independent of $i$. These two properties may be used to collect terms in the expansion of $E(R^2)$ to give

$$E(R^2) = (n - k + 1) E(r_1^2) + 2 \sum_{i=0}^{k-2} (n - k - i) E(r_1 r_{i+2}) + (n - 2k + 1)(n - 2k + 2) E^2(r_1).$$

Consequently,

$$\sigma_R^2 = (n - k + 1) E(r_1^2) + 2 \sum_{i=0}^{k-2} (n - k - i) E(r_1 r_{i+2}) + [n(1 - 2k) + (k - 1)(3k - 1)] E^2(r_1). \tag{13}$$

From the definition of the correlation coefficient and the fact that a correlation coefficient cannot exceed one, it follows that

$$E(r_1 r_{i+2}) \le E(r_1) E(r_{i+2}) + \sigma_{r_1} \sigma_{r_{i+2}} = E^2(r_1) + \sigma_{r_1}^2 .$$

If this inequality is applied to (13),

$$\begin{aligned}
\sigma_R^2 &\le (n - k + 1) E(r_1^2) + (k - 1)(2n - 3k + 2)[E^2(r_1) + \sigma_{r_1}^2] + [n(1 - 2k) + (k - 1)(3k - 1)] E^2(r_1) \\
&= (n - k + 1)[E(r_1^2) - E^2(r_1)] + (k - 1)(2n - 3k + 2) \sigma_{r_1}^2 \\
&= [n(2k - 1) - (k - 1)(3k - 1)] \sigma_{r_1}^2 \\
&< 2k \sigma_{r_1}^2 (n - k + 1).
\end{aligned}$$

Thus, for a fixed $k$ the right inequality in (a) of Bernstein's modified theorem is satisfied.

For the purpose of demonstrating that the left inequality in (a) is also satisfied, consider the following application of Schwarz's inequality. Let

$$G(x_p, \ldots, x_k) = \int \cdots \int r_p\, f(x_{k+1}) \cdots f(x_{k+p-1})\, dx_{k+1} \cdots dx_{k+p-1}, \tag{14}$$

where $f(x)$ denotes the distribution function of the variable $x$ and the range of integration in this and subsequent integrals is from $-\infty$ to $\infty$. Since $r_p$ and $f$ are continuous non-negative functions, this integral is a positive function of the indicated variables. Then, denoting $G(x_p, \ldots, x_k)$ by $G$, it follows from Schwarz's inequality that

$$\begin{aligned}
\left[ \int \cdots \int r_1 f(x_1) \cdots f(x_k)\, dx_1 \cdots dx_k \right]^2
&= \left[ \int \cdots \int \{ r_1 f(x_1) \cdots f(x_k) G \}^{1/2} \{ r_1 f(x_1) \cdots f(x_k) G^{-1} \}^{1/2}\, dx_1 \cdots dx_k \right]^2 \\
&\le \int \cdots \int r_1 f(x_1) \cdots f(x_k) G\, dx_1 \cdots dx_k \cdot \int \cdots \int r_1 f(x_1) \cdots f(x_k) G^{-1}\, dx_1 \cdots dx_k .
\end{aligned} \tag{15}$$

The two integrals of this inequality will be denoted by $I_\alpha$ and $I_\beta$, respectively. If the value of $G$ given by (14) is substituted in $I_\alpha$, it will be observed that

$$I_\alpha = \int \cdots \int r_1 r_p\, f(x_1) \cdots f(x_{k+p-1})\, dx_1 \cdots dx_{k+p-1} . \tag{16}$$

Now $I_\beta$ may be written in the form

$$I_\beta = \int \cdots \int f(x_p) \cdots f(x_k)\, G^{-1} \left[ \int \cdots \int r_1 f(x_1) \cdots f(x_{p-1})\, dx_1 \cdots dx_{p-1} \right] dx_p \cdots dx_k .$$

Since the $x_i$ possess the same distribution function and $r_1$ is the range of the variables from $x_1$ to $x_k$, the integral in brackets is equivalent to the integral defining $G$ in (14); hence

$$I_\beta = \int \cdots \int f(x_p) \cdots f(x_k)\, G^{-1} G\, dx_p \cdots dx_k = 1. \tag{17}$$

If (16) and (17) are applied to inequality (15), they will yield the inequality

$$\left[ \int \cdots \int r_1 f(x_1) \cdots f(x_k)\, dx_1 \cdots dx_k \right]^2 \le \int \cdots \int r_1 r_p\, f(x_1) \cdots f(x_{k+p-1})\, dx_1 \cdots dx_{k+p-1} .$$

In statistical language, this inequality states that

$$E^2(r_1) \le E(r_1 r_p),$$

or, what is equivalent, that
$$E^2(r_i) \le E(r_i r_j). \tag{18}$$

If (18) is applied to (13),

$$\begin{aligned}
\sigma_R^2 &\ge (n - k + 1) E(r_1^2) + (k - 1)(2n - 3k + 2) E^2(r_1) + [n(1 - 2k) + (k - 1)(3k - 1)] E^2(r_1) \\
&= (n - k + 1)[E(r_1^2) - E^2(r_1)] \\
&= \sigma_{r_1}^2 (n - k + 1).
\end{aligned}$$

Thus, for a fixed $k$ the left inequality in (a) of the theorem is also satisfied, and it merely remains to be shown that condition (b) is satisfied.

For $k$ fixed, $r_i$ and $r_{i+g}$ will be independently distributed provided that $g \ge k$. But if $c_3 > k$, then $c_3 (n - k + 1)^{c_4} > k$ for $0 < c_4 <$ … because $n - k + 1 \ge 1$; consequently $r_i$ and $r_{i+g}$ will be independently distributed for $g > c_3 (n - k + 1)^{c_4}$, where $0 < c_4 <$ …. Thus, conditions (a) and (b) are both satisfied by $R$. Since $R = (n - k + 1)W$, it therefore follows that

$$\frac{W - E(W)}{\sigma_W} \tag{19}$$

possesses an asymptotic normal distribution with zero mean and unit variance provided only that $x$ possesses a continuous distribution function for which the third absolute moment exists. The existence of the third absolute moment for $x$ insures the existence of the same moment for $r_i$.

If $k = 2$, $W$ reduces to $w$, and therefore the validity of (11) is assured.


6. Other asymptotic distributions. The only property of the range employed in the proof of the preceding section was its positive nature; consequently the proof is applicable to moving means of other dependent statistics that are positive and possess third absolute moments.

For example, the preceding proof can be applied to $\delta^2$ to show that $\delta^2$ possesses an asymptotic normal distribution provided only that the sixth moment of $x$ exists. In the study [1] of the efficiency of $\delta^2$ for $x$ normally distributed, no proof was given of its asymptotic property. The preceding proof could be used in studying the efficiency of $\delta^2$, or obvious generalizations of it, as measures of variation for non-normal populations. The normality of the asymptotic distribution of the serial correlation coefficient could also be verified by means of this proof.
CONFIDENCE LIMITS FOR THE FRACTION OF A NORMAL
POPULATION WHICH LIES BETWEEN TWO GIVEN LIMITS¹


By J. Wolfowitz
Columbia University

Summary. Let $\mu$ and $\sigma^2$ be the unknown mean and variance, respectively, of a normally distributed population on which $N$ independent observations $x_1, \ldots, x_N$ have been made. Let $L_1$ and $L_2$, $L_1 < L_2$, and $\alpha$, $0 < \alpha < 1$, be given constants. We define the following symbols:

(a) $\gamma = (\sqrt{2\pi}\,\sigma)^{-1} \int_{L_1}^{L_2} \exp\left\{ -\tfrac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^2 \right\} dy$

(b) $\bar{x} = N^{-1} \sum x_i$

(c) $s^2 = (N - 1)^{-1} \sum (x_i - \bar{x})^2$

(d) $\chi^2_{1-\alpha}$ as that number for which $P\{\chi^2 < \chi^2_{1-\alpha}\} = 1 - \alpha$ where $\chi^2$ has $N - 1$ degrees of freedom.



It is proved that, under restrictions stated precisely below, and before the observations are made, the probability that $D < \gamma$ differs from $\alpha$ by a number which can be made arbitrarily small by making $N$ sufficiently large. Thus an approximate (large sample) lower confidence limit for $\gamma$ is obtained. Similar methods can be applied to obtain upper and two-sided confidence limits.

A problem raised by the present paper (but not attacked here) is to investigate the rapidity of approach to $\alpha$ of $P\{D < \gamma\}$. It would perhaps be useful to obtain a series for the latter in powers of $N^{-1}$; the first term of such an expansion is obtained here.

¹ Formula (5.1) of the present paper was given without proof by the author in July, 1945, in solution of a problem put to him by Dr. M. A. Girshick. At the time, both were members of the Statistical Research Group, formed in the Division of War Research of Columbia University under contract with the National Defense Research Committee of the Office of Scientific Research and Development. The validation of formula (5.1) in all rigor as it is given in the present paper was constructed by the author after he was no longer a member of the Statistical Research Group.

In January, 1945, Professor A. Wald, then a consultant to the Statistical Research Group, and the present author jointly submitted to the Group an unpublished memorandum (#410) entitled “Acceptance Regions Which Involve the Normal Distribution and Large Sample Sizes.” While this memorandum dealt with a different problem, its ideas were logically antecedent to formula (5.1). The present author wishes to express his indebtedness to this memorandum and to his colleague Professor Wald.
1. The problem. Let $\mu$ and $\sigma^2$ be the unknown mean and variance, respectively, of a normally distributed population on which the $N$ independent observations $x_1, x_2, \ldots, x_N$ have been made. Let $L_1$ and $L_2$ be given constants with $L_1 < L_2$. We then have that

$$\gamma = \gamma(\mu, \sigma) = (\sqrt{2\pi}\,\sigma)^{-1} \int_{L_1}^{L_2} \exp\left\{ -\tfrac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^2 \right\} dy$$

is the fraction of the normal population which lies between $L_1$ and $L_2$. The problem considered in this paper is to construct a lower confidence limit for the unknown $\gamma$, when $N$ is large. An upper confidence limit or two-sided confidence limits may be constructed in a manner very similar to that described in the present paper. Since the construction of a lower limit is the problem which occurs most often in practice the discussion will be centered on it.

A lower (confidence) limit on $\gamma$ with confidence coefficient $\alpha$ is a function $D(x_1, \ldots, x_N)$ of the observations $x_1, \ldots, x_N$ with the property that, before the observations are made, the probability is $\alpha$ that $D(x_1, \ldots, x_N) < \gamma$. In any specific application it is unknown whether this last inequality holds, because $\gamma$ is unknown. However, one who proceeds as if this inequality were true is using a procedure which will give correct results $100\alpha\%$ of the time in the long run.

When either $L_1 = -\infty$ or $L_2 = +\infty$ the solution, by use of the non-central $t$ distribution, is well known. For a description of the procedure and necessary tables the reader is referred to [1].

2. Acceptance regions. Let $\gamma_0$ be any value of the parameter $\gamma$. To $\gamma_0$ there correspond infinitely many couples $(\mu, \sigma)$ with the property that the normal distributions characterized by these couples all have a fraction $\gamma_0$ lying between $L_1$ and $L_2$; we may write this symbolically by saying that the couples $(\mu, \sigma)$ satisfy

$$\gamma(\mu, \sigma) = \gamma_0 . \tag{2.1}$$

The construction of confidence regions is equivalent to the construction, for every $\gamma_0$, of an acceptance region $R(\gamma_0)$ in the $N$-dimensional Euclidean space, with the property that every normal distribution whose parameters $\mu$ and $\sigma$ satisfy (2.1) assigns to $R(\gamma_0)$ the constant probability $\alpha$. While this property of similarity (cf. [2]) is sufficient for the construction of confidence regions, additional properties of the acceptance regions $R(\gamma_0)$ are needed in order that the confidence region be an interval or that the upper confidence limit be always one (i.e., that the confidence limits turn out to be a lower limit only), or to insure other features deemed desirable.

It is easy to construct acceptance regions which will fulfill the condition of similarity. As an example, consider the case $N = 3$ for convenience. Let $b_1$, $b_2$, $b_3$ be a number triple such that $b_1 + b_2 + b_3 = 0$. Let $R(\gamma_0)$, for any given $\gamma_0$, $0 < \gamma_0 < 1$, consist of all the points $x_1, x_2, x_3$ which are such that the absolute value of the angle $\psi$ $(-\pi < \psi \le \pi)$ between the vector $(b_1, b_2, b_3)$ and the vector $(x_1 - \bar{x}, x_2 - \bar{x}, x_3 - \bar{x})$ does not lie between $\pi\alpha\gamma_0$ and $\pi + \pi\alpha(\gamma_0 - 1)$. (We define, in general, $\sum x_i = N\bar{x}$. The points $(x_1, x_2, x_3)$ for which $x_1 = x_2 = x_3$ may be disregarded, since their probability is zero when the distribution is continuous.) One readily verifies that the probability of $R(\gamma_0)$ for any $\gamma_0$ is $\alpha$, no matter what $\mu$ and $\sigma$ are, and hence this is true in particular for the pairs which satisfy (2.1).
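The claim that this region has probability $\alpha$ whatever the mean and standard deviation can be checked by simulation. The sketch below is illustrative only: the function names, the choice of the triple (1, -1, 0), the seed and the number of trials are assumptions made for the example. The excluded band of angles has length $\pi(1 - \alpha)$, and the unsigned angle is uniformly distributed on $(0, \pi)$ because the residual vector has no preferred direction in the plane orthogonal to (1, 1, 1).

```python
import math
import random

def abs_angle(x, b):
    """Unsigned angle in [0, pi] between b and the vector of deviations x_i - mean(x)."""
    xbar = sum(x) / len(x)
    d = [xi - xbar for xi in x]
    dot = sum(bi * di for bi, di in zip(b, d))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    norm_d = math.sqrt(sum(di * di for di in d))
    cosine = max(-1.0, min(1.0, dot / (norm_b * norm_d)))
    return math.acos(cosine)

def in_region(x, b, gamma0, alpha):
    """True when |psi| does NOT lie between pi*alpha*gamma0 and pi + pi*alpha*(gamma0 - 1)."""
    lower = math.pi * alpha * gamma0
    upper = math.pi + math.pi * alpha * (gamma0 - 1.0)
    return not (lower < abs_angle(x, b) < upper)

def estimate_region_probability(mu, sigma, gamma0, alpha,
                                b=(1.0, -1.0, 0.0), trials=20000, seed=0):
    """Monte Carlo estimate of P{R(gamma0)} for samples of size 3 from N(mu, sigma^2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(mu, sigma) for _ in range(3)]
        if in_region(x, b, gamma0, alpha):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(estimate_region_probability(0.0, 1.0, 0.3, 0.9))
    print(estimate_region_probability(5.0, 10.0, 0.8, 0.9))
```

Both estimates should be close to the chosen $\alpha$, which illustrates why such regions carry no information about $\gamma$.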

The above method of constructing acceptance regions yields confidence regions which, while they cover the unknown $\gamma$ with confidence coefficient $\alpha$, are not very meaningful otherwise. The fact that the probability of $R(\gamma_0)$ is $\alpha$ whether or not $(\mu, \sigma)$ satisfies (2.1) is already indicative of their lack of discrimination. Since $\bar{x}$ and $s$ (where $s$ is defined by

$$n s^2 = \sum_{i=1}^{N} (x_i - \bar{x})^2$$

and $n = N - 1$) are sufficient estimates of $\mu$ and $\sigma$, which in turn determine $\gamma$, it is clear that desirable confidence regions should be functions only of $\bar{x}$ and $s$. Consequently our first task must be to construct the acceptance regions $R(\gamma_0)$ in the $\bar{x}$, $s$ plane. In the present paper we construct in the $\bar{x}$, $s$ plane regions $R(\gamma_0)$ which have the property that their probability, under any normal distribution whose parameters satisfy (2.1), differs from the prescribed $\alpha$ by a quantity which is bounded in absolute value for all $\gamma_0$, in such a way that the bound approaches zero as $N$ increases. Thus when the sample number is sizeable we can obtain confidence regions for $\gamma$ which correspond to a confidence coefficient which differs little from $\alpha$. Finally, the acceptance regions $R(\gamma_0)$ which we shall construct will be such that the confidence region will be always an interval, and the upper limit will always be 1, i.e., we will construct a lower confidence limit for $\gamma$.

3. Construction of regions $R(\gamma_0)$ in the $\bar{x}$, $s$ plane. First we describe two assumptions which we shall make. It is believed that these are reasonable from the practical standpoint and are satisfied in most actual investigations where the present problem arises. Mathematically their purpose is to enable us to secure a uniform bound on the difference between $\alpha$ and the probability of $R(\gamma_0)$ (for all $\gamma_0$) under all couples $(\mu, \sigma)$ which satisfy (2.1).

Assumption 1: There exists a positive $d$ such that

$$L_1 + d < \mu < L_2 - d .$$

In most practical cases where the present problem will occur $\gamma$ will be larger than $\tfrac{1}{2}$. If the latter is the case and $\mu$ were very near either $L_1$ or $L_2$, then $\sigma$ would have to be very small. In that case other methods would have to be used in the solution of the practical problem. The present paper deals with the situation, unfortunately only too common in practice, where $\sigma$ is not too small. Assumption 1 puts a lower bound on $\sigma$ for any given value $\gamma_0$. (The bound is a function of $\gamma_0$.)
Assumption 2: The standard deviation $\sigma$ is less than a positive number $C$.

In most practical problems such an upper bound can reasonably be set. Naturally, the larger $d$ and the smaller $C$ the more a priori information is at our disposal, the closer are our approximations and the narrower our limits. The effect of Assumptions 1 and 2 is to place a lower limit $G$ on $\gamma$ where

$$G = \gamma(L_1 + d, C) = \gamma(L_2 - d, C).$$

Let $\gamma_0$ be any positive number such that $G < \gamma_0 < 1$. For an $\bar{x}$ such that $L_1 < \bar{x} < L_2$, let $r(\bar{x}, \gamma_0)$ be the positive number such that

$$\gamma(\bar{x}, r(\bar{x}, \gamma_0)) = \gamma_0 .$$

We define $\chi^2_{1-\alpha}$ to be that number for which

$$P(\chi^2 < \chi^2_{1-\alpha}) = 1 - \alpha,$$

where $\chi^2$ has $n$ degrees of freedom and $P$ is the probability of the relation in parentheses. The number $\chi^2_{1-\alpha}$ may be found in tables of the $\chi^2$-distribution if the value of $\alpha$ is one of those in common use. Finally define

$$\varphi(\bar{x}, \gamma_0) = r(\bar{x}, \gamma_0)\, \frac{\chi_{1-\alpha}}{\sqrt{n}} .$$

The acceptance regions $A(\gamma_0)$, $G < \gamma_0 < 1$, which we shall employ, are defined as follows for any $\gamma_0$, $G < \gamma_0 < 1$:

$$L_1 < \bar{x} < L_2,$$

$$s > \varphi(\bar{x}, \gamma_0).$$

4. Proof that $P\{A(\gamma_0)\} \sim \alpha$. This section will be devoted to a proof of the following:

Theorem. Let $A(\gamma_0)$ be as defined in Section 3 for $G < \gamma_0 < 1$. Let the Assumptions 1 and 2 of Section 3 be fulfilled. Then the absolute value of the difference between $\alpha$ and the probability of $A(\gamma_0)$ under any couple $(\mu, \sigma)$ which satisfies (2.1) is less than any arbitrarily small positive $\epsilon$ when $N$ is sufficiently large, i.e., when $N$ is sufficiently large,

$$|P\{A(\gamma_0)\} - \alpha| < \epsilon$$

uniformly for all $(\mu, \sigma)$ which satisfy (2.1) with $G < \gamma_0 < 1$, and which fulfill Assumptions 1 and 2.

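Lemma 1. $\dfrac{\partial r(\bar{x}, \gamma_0)}{\partial \bar{x}}$ exists in the open interval $L_1 < \bar{x} < L_2$.

Proof: We have

$$\gamma_0 = \frac{1}{\sqrt{2\pi}\, r(\bar{x}, \gamma_0)} \int_{L_1}^{L_2} \exp\left\{ -\tfrac{1}{2} \left( \frac{y - \bar{x}}{r(\bar{x}, \gamma_0)} \right)^2 \right\} dy .$$

Differentiating with respect to $\bar{x}$ we obtain, since $r > 0$,

$$e^{-R^2/2} \left( 1 + R \frac{\partial r}{\partial \bar{x}} \right) = e^{-T^2/2} \left( 1 + T \frac{\partial r}{\partial \bar{x}} \right)$$

with

$$R = \frac{L_2 - \bar{x}}{r}, \qquad T = \frac{L_1 - \bar{x}}{r} .$$

Hence

$$\frac{\partial r}{\partial \bar{x}} = -\left( \frac{e^{-R^2/2} - e^{-T^2/2}}{R e^{-R^2/2} - T e^{-T^2/2}} \right). \tag{4.1}$$

Since $R > 0$ and $T < 0$ within the open interval $L_1 < \bar{x} < L_2$, it follows that $\dfrac{\partial r}{\partial \bar{x}}$ exists in the entire open interval.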

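Lemma 2. In the open interval $L_1 < \bar{x} < L_2$,

$$\frac{\partial P\{s > \varphi(\bar{x}, \gamma_0)\}}{\partial \bar{x}}$$

exists.

Proof: We have, with $k$ a suitable constant,

$$P = k \int_{\sqrt{n}\varphi/\sigma}^{\infty} y^{n-1} e^{-y^2/2}\, dy = k \int_{(r(\bar{x}, \gamma_0)/\sigma)\chi_{1-\alpha}}^{\infty} y^{n-1} e^{-y^2/2}\, dy .$$

Hence

$$\frac{\partial P}{\partial \bar{x}} = -k\, \frac{\chi_{1-\alpha}}{\sigma}\, \frac{\partial r}{\partial \bar{x}} \left[ y^{n-1} e^{-y^2/2} \right]_{y = (r(\bar{x}, \gamma_0)/\sigma)\chi_{1-\alpha}} . \tag{4.2}$$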

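Lemma 3. Let $\delta$ be any arbitrarily small positive number. The function $\dfrac{\partial P}{\partial \bar{x}}$ of $\bar{x}$ and $\gamma_0$ is bounded for $L_1 + \delta < \bar{x} < L_2 - \delta$, $G < \gamma_0 < 1$.

Proof: From (4.1) we have

$$\left| \frac{\partial r}{\partial \bar{x}} \right| \le \frac{e^{-R^2/2} + e^{-T^2/2}}{R e^{-R^2/2} - T e^{-T^2/2}} \le \left( \frac{1}{R} - \frac{1}{T} \right) = \left( \frac{r}{L_2 - \bar{x}} + \frac{r}{\bar{x} - L_1} \right) \le \frac{2r}{\delta} .$$

Therefore from (4.2) we have that $\left| \dfrac{\partial P}{\partial \bar{x}} \right|$ is less than a constant multiplied by

$$\left( \frac{r \chi_{1-\alpha}}{\sigma} \right)^{n} \exp\left\{ -\tfrac{1}{2} \left( \frac{r \chi_{1-\alpha}}{\sigma} \right)^2 \right\}$$

and is therefore bounded.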


Proof of the theorem: From Lemma 3 and the Theorem of the Mean it follows that, in the closed interval

$$L_1 + \frac{d}{2} \le \bar{x} \le L_2 - \frac{d}{2},$$

the function $P\{s > \varphi(\bar{x}, \gamma_0)\}$ is uniformly continuous in $\bar{x}$ uniformly for all $(\mu, \sigma)$ which satisfy (2.1) with $G < \gamma_0 < 1$. Hence for every positive $\epsilon_1$ there exists a positive $\eta < \dfrac{d}{2}$ such that $|l_1 - l_2| < \eta$,

$$L_1 + \frac{d}{2} \le l_1,\ l_2 \le L_2 - \frac{d}{2},$$

implies

$$|P\{s > \varphi(l_1, \gamma_0)\} - P\{s > \varphi(l_2, \gamma_0)\}| < \epsilon_1 .$$

For fixed arbitrary $\epsilon_2 > 0$ we have, when $N$ is sufficiently large,

$$P\{|\bar{x} - \mu| < \eta\} > 1 - \epsilon_2,$$

from Assumption 2 and the stochastic convergence of $\bar{x}$. Now

$$P\{s > \varphi(\mu, \gamma_0)\} = \alpha .$$

Hence, when $N$ is sufficiently large,

$$|P\{A(\gamma_0)\} - \alpha| < \epsilon_1 (1 - \epsilon_2) + \epsilon_2 < \epsilon_1 + \epsilon_2 .$$

Since $\epsilon_1$ and $\epsilon_2$ are arbitrarily small, this proves the desired result.


5. Construction of large sample confidence regions. The acceptance regions $R(\gamma_0)$ whose size never differs from $\alpha$ by more than a uniform bound which approaches zero as $N$ increases, readily yield a lower confidence limit for $\gamma$ (within the approximation involved). The confidence region consists of all the $\gamma_0$ for which $R(\gamma_0)$ contains the observed $\bar{x}$, $s$. Our acceptance regions $R(\gamma_0)$ are so constructed that, if $\gamma_1 < \gamma_2$, $R(\gamma_1)$ is entirely contained within $R(\gamma_2)$. Hence the confidence region is an interval, one end of which is always unity, as was desired. The rule for constructing the lower confidence limit $D$ is, therefore, as follows:

a) if $\bar{x} \le L_1$ or $\bar{x} \ge L_2$, then $D = G$

b) if $L_1 < \bar{x} < L_2$, then

$$D = \frac{1}{\sqrt{2\pi}} \int_{(L_1 - \bar{x})/w}^{(L_2 - \bar{x})/w} \exp\left\{ -\tfrac{1}{2} y^2 \right\} dy \tag{5.1}$$

where

$$w = \sqrt{N - 1} \cdot \frac{s}{\chi_{1-\alpha}} .$$

(The value of $D$ may be found in a table of the normal distribution. It is easy to see that $s = \varphi(\bar{x}, D)$, i.e., $D$ is the smallest value of $\gamma_0$ for which $\bar{x}$, $s$ will still lie in $R(\gamma_0)$.)

If the statement $D < \gamma$ is made in a large number of cases, where the assumptions are fulfilled and the sample size is large, the proportion of correct statements will be close to $\alpha$.
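The rule for the lower limit lends itself to a short routine. The following is a sketch under stated assumptions rather than a definitive implementation: the chi-square quantile is approximated by the Wilson-Hilferty formula (reasonable here because the method is a large-sample one), the value G implied by Assumptions 1 and 2 is supplied by the caller, and the function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def lower_confidence_limit(xbar, s, N, L1, L2, alpha, G):
    """Approximate lower confidence limit D for the fraction between L1 and L2.

    xbar, s : sample mean and standard deviation (s uses divisor N - 1)
    alpha   : confidence coefficient, e.g. 0.95
    G       : lower limit on the fraction implied by Assumptions 1 and 2 (caller-supplied)
    """
    if xbar <= L1 or xbar >= L2:
        return G  # rule (a)
    n = N - 1
    std_normal = NormalDist()
    # Wilson-Hilferty approximation to chi-square_{1-alpha} with n degrees of freedom,
    # i.e. the point below which the chi-square variable falls with probability 1 - alpha.
    z = std_normal.inv_cdf(1.0 - alpha)
    chi2 = n * (1.0 - 2.0 / (9.0 * n) + z * math.sqrt(2.0 / (9.0 * n))) ** 3
    w = math.sqrt(n) * s / math.sqrt(chi2)
    # Rule (b): normal probability of the interval ((L1 - xbar)/w, (L2 - xbar)/w).
    return std_normal.cdf((L2 - xbar) / w) - std_normal.cdf((L1 - xbar) / w)

if __name__ == "__main__":
    D = lower_confidence_limit(xbar=0.0, s=1.0, N=100, L1=-2.0, L2=2.0, alpha=0.95, G=0.5)
    print(D)
```

Because w exceeds s whenever the chi-square quantile is below n, the limit D falls below the plug-in fraction computed with s itself, as one would expect of a lower limit.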


REFERENCES 

[1] N. L. Johnson and B. L. Welch, “Applications of the non-central t-distribution,” Biometrika, Vol. 31, Parts III and IV (1940), pp. 362-389.

[2] J. Neyman, “Outline of a theory of estimation,” Phil. Trans. Roy. Soc. London, Series A, Vol. 236 (1937), pp. 333-80.



NOTES 

This section is devoted to brief research and expository articles on methodology 
and other short items. 


ON SEQUENTIAL BINOMIAL ESTIMATION 

By J. Wolfowitz 
Columbia University 

The present note, written after a reading of the very interesting paper by Girshick, Mosteller, and Savage [1], is for the purpose of adding a few remarks in the nature of a supplement. For the sake of brevity the notation and terminology of [1] are adopted in toto.

Theorem 1 below generalizes Theorem 1 of [1]. In Theorem 2 we formulate explicitly the fact which lies at the basis of the GMS method of estimation. Parts of the proofs of Theorems 3 and 4 of [1] are simply proofs of special cases of this (e.g., equation (2) of [1]). We then use this fact repeatedly in proving Theorem 3, which states that the Girshick-Mosteller-Savage estimate is the only proper unbiased estimate for sequential tests defined by regions which we shall call doubly simple.

A doubly simple region is defined precisely below. Intuitively we may
describe such a region as the one between two curves $y = f_1(x)$ and $x = f_2(y)$,
where $f_1(x)$ is defined and monotonically non-decreasing for all non-negative
$x$, $f_2(y)$ is defined and monotonically non-decreasing for all non-negative $y$,
$f_1(0) > 0$, $f_2(0) > 0$. If the two curves intersect, the region is finite, and the
values of the functions $f_1$ and $f_2$ beyond the point of intersection are of no
interest. This description is of course purely heuristic, because in actual fact only
integral values of the variables come into play, and intersection of the curves,
for example, is not needed to make the region finite. Since the question of finite
regions is completely settled by [1], Theorem 7, only non-finite regions remain
to be discussed, and the precise definition given below is such as to imply that
the region is not finite. It seems to the present writer that at least many of the
non-finite sequential tests which may be developed for meaningful statistical
problems will require doubly simple regions. The Wald sequential binomial
test [2] defines such a region, which also falls within the scope of Theorem 6 of
[1]. It is easy to see that there exist closed regions which are doubly simple
and do not satisfy the conditions of this theorem.

By a "proper" estimate $p(\alpha)$ we shall mean an estimate such that $0 \le p(\alpha) \le 1$
for every $\alpha$. It is difficult to see how any estimate which is not proper can
make much sense.

Theorem 1. A sufficient condition that a region R be closed is that

$$\liminf_{n \to \infty} \frac{A(n)}{\sqrt{n}} < \infty,$$

where $A(n)$ is the number of accessible points of index $n$.

Proof: The hypothesis of the theorem implies that there exist a positive
number $H$ and an increasing sequence of positive integers $n_1, n_2, n_3, \cdots$, with
the following properties:

a) $n_{i+1} > 2 n_i$ $(i = 1, 2, \cdots$ ad inf.)

b) $A(n_i) < H \sqrt{n_i}$.

For $n_i$ sufficiently large, the conditional probability of reaching the accessible
points on $x + y = n_{i+1}$, when an accessible point on $x + y = n_i$ has been
reached, is $< K < 1$ by the normal approximation to the binomial distribution,
where $K$ is constant (and depends on $H$). Hence the probability of passing
through accessible points on all members of the set $x + y = n_i$ $(i = 1, 2, \cdots, L)$
approaches zero as $L \to \infty$, so that the region is closed.

Theorem 2. Let R be any region, B its boundary, and $t = (a, b)$ any accessible
point in R. Let $l_t(\alpha)$ be the number of paths from $t$ to $(x, y) = \alpha \in B$. Let $Q(t)$
be the conditional probability that a path, which has reached $t$, will reach the
boundary B. Then

$$\sum_{\alpha \in B} l_t(\alpha)\, p^y q^x = Q(t)\, p^b q^a.$$

Theorem 2'. (Corollary to Theorem 2)

If R is closed, then

$$\sum_{\alpha \in B} l_t(\alpha)\, p^y q^x = p^b q^a. \tag{1}$$

Proof: Let $k(t)$ be the number of paths in R from the origin to $t$. The
probability of reaching $\alpha \in B$ by a path which passes through $t$ is $k(t)\, l_t(\alpha)\, p^y q^x$.
The probability of reaching $t$ from the origin is $k(t)\, p^b q^a$, and hence the
probability of reaching the boundary via $t$ is $Q(t)\, k(t)\, p^b q^a$. From this the desired
result follows.
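Identity (1) can be checked by direct path counting on a small closed region. The sketch below is illustrative only: the region (stop at 3 failures or 2 successes) is a hypothetical finite example, not a doubly simple region, and the helper names are assumptions. It counts paths from the origin and from (0, 1), and confirms that the weighted sums equal 1 and p respectively; the second fact is what makes the ratio of the two counts an unbiased estimate of p.

```python
from fractions import Fraction


def boundary_path_counts(start, is_boundary, max_steps):
    """Number of paths from `start` to each boundary point.

    A path moves from (x, y) to (x + 1, y) or (x, y + 1) and stops at the
    first boundary point it meets. Returns {boundary point: path count}.
    """
    live = {start: 1}          # accessible points reached so far, with counts
    reached = {}
    for _ in range(max_steps):
        nxt = {}
        for (x, y), k in live.items():
            for point in ((x + 1, y), (x, y + 1)):
                target = reached if is_boundary(point) else nxt
                target[point] = target.get(point, 0) + k
        live = nxt
    return reached


def stop(point):
    # Hypothetical region: stop at 3 failures (x) or 2 successes (y).
    x, y = point
    return x == 3 or y == 2


def weighted_sum(counts, p):
    """Sum of count * p^y * q^x over boundary points, with q = 1 - p."""
    q = 1 - p
    return sum(k * p**y * q**x for (x, y), k in counts.items())


p = Fraction(1, 3)
k = boundary_path_counts((0, 0), stop, 10)       # k(alpha)
k_star = boundary_path_counts((0, 1), stop, 10)  # paths from (0, 1)
total_from_origin = weighted_sum(k, p)           # identity (1) with t = (0, 0)
total_from_01 = weighted_sum(k_star, p)          # identity (1) with t = (0, 1)
```

With exact rational arithmetic the two totals can be compared to 1 and p without rounding concerns.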

We now define a doubly simple region. The boundary of the region consists
of the two infinite sequences of points

$$(0, a_0),\ (1, a_1),\ (2, a_2),\ \cdots$$

and

$$(b_0, 0),\ (b_1, 1),\ (b_2, 2),\ \cdots$$

where $a_0, a_1, a_2, \cdots$ and $b_0, b_1, b_2, \cdots$ are two infinite non-decreasing
sequences of positive integers. The accessible points of the region are all points
which can be reached by a path from the origin which does not contain a
boundary point. (It is to be noted that since a boundary point is, by definition,
a point not in the region which can be reached by a path in the region, the above
definition implies that a doubly simple region is not finite. The reason for
making this so has been given above.)

Theorem 3. Let R be a closed doubly simple region. Then $\hat{p}(\alpha)$ is the unique
proper unbiased estimate of $p$.

Proof: Suppose there were two proper unbiased estimates $p_1(\alpha)$ and $p_2(\alpha)$.
Writing $m(\alpha) = p_1(\alpha) - p_2(\alpha)$, we would have

$$\sum_{\alpha \in B} m(\alpha)\, k(\alpha)\, p^y q^x = 0 \tag{2}$$

with

$$|m(\alpha)| \le 1. \tag{3}$$

First we prove

Lemma 1. If $a_0 > 1$, then $m(b_0, 0) = 0$.

Proof: Let $k^*(\alpha)$ denote the number of paths in R from the point (0, 1) to
the boundary point $\alpha$. For all points $\alpha \in B$ except $(b_0, 0)$ we have

$$b_0\, k^*(\alpha) \ge k(\alpha). \tag{4}$$

From (1), (2), (3), and (4) we have, since $k(b_0, 0) = 1$,

$$|m(b_0, 0)|\, q^{b_0} = \Big| \sum_{\alpha \in B,\ \alpha \ne (b_0, 0)} m(\alpha)\, k(\alpha)\, p^y q^x \Big| \le \sum_{\alpha \in B,\ \alpha \ne (b_0, 0)} k(\alpha)\, p^y q^x \le b_0 \sum_{\alpha \in B} k^*(\alpha)\, p^y q^x = b_0\, p. \tag{5}$$

Now as $p \to 0$, the left member of the inequality (5) approaches $|m(b_0, 0)|$,
and the right member approaches zero. This proves Lemma 1.

Lemma 2. For every $z < a_0 - 1$, $m(b_z, z) = 0$.

Proof: In view of Lemma 1 it is sufficient to prove the following:

If $Z \le a_0 - 2$, and if $m(b_z, z) = 0$ for $z = 0, 1, \cdots, Z - 1$, then $m(b_Z, Z) = 0$.
Let $k_{Z+1}(\alpha)$ denote the number of paths in R from $(0, Z + 1)$ to the
boundary point $\alpha$. For any point $\alpha \in B$ whose ordinate is $\ge Z + 1$ we have

$$b_0 b_1 \cdots b_Z\, k_{Z+1}(\alpha) \ge k(\alpha). \tag{6}$$

From (1), (2), (3), and (6) we have

$$|m(b_Z, Z)|\, k(b_Z, Z)\, p^Z q^{b_Z} = \Big| \sum m(\alpha)\, k(\alpha)\, p^y q^x \Big| \le \sum k(\alpha)\, p^y q^x \le b_0 b_1 \cdots b_Z \sum k_{Z+1}(\alpha)\, p^y q^x = b_0 b_1 \cdots b_Z\, p^{Z+1}, \tag{7}$$

where the summations take place over all boundary points whose ordinates are
$\ge Z + 1$. Hence

$$|m(b_Z, Z)|\, k(b_Z, Z)\, q^{b_Z} \le b_0 b_1 \cdots b_Z\, p,$$

and letting $p \to 0$ we obtain the desired result.

Lemma 3. $m(b_{a_0 - 1}, a_0 - 1) = 0$.

Proof: Let $s$ be the smallest integer such that $(s, a_0)$ is an accessible point.
We proceed as in Lemma 2, with $(s, a_0)$ playing the role of $(0, Z + 1)$, and
eventually obtain the following inequality:

$$|m(b_{a_0 - 1}, a_0 - 1)|\, k(b_{a_0 - 1}, a_0 - 1)\, p^{a_0 - 1} q^{b_{a_0 - 1}} = \Big| \sum\nolimits_0 m(\alpha)\, k(\alpha)\, p^y q^x \Big|$$

where $\sum_0$ denotes summation over all boundary points with ordinate $\ge a_0$.
The desired result follows.

Lemma 4. Let $h\ (\ge a_0)$ be the smallest ordinate for which at least one boundary
point $(w^*, h)$ exists such that $m(w^*, h) \ne 0$. (If no such $h$ exists the theorem is proved.)
Of all such points let $w$ be the one with the smallest abscissa. Then the point $(w, h)$
is a member of the sequence

$$(0, a_0),\ (1, a_1),\ (2, a_2),\ \cdots$$

Proof: If the lemma is not true, then for all boundary points $\alpha$ with ordinate
$h$, $m(\alpha) = 0$, except that $m(b_h, h) \ne 0$. Let $W$ be that accessible point of R
whose ordinate is $h + 1$ and whose abscissa $v$ is a minimum. Let $k_W(\alpha)$ be the
number of paths in R from $W$ to the boundary point $\alpha$. For boundary points
$\alpha$ accessible from $W$ we have

$$b_0 b_1 \cdots b_h\, k_W(\alpha) \ge k(\alpha). \tag{9}$$

From (1), (2), (3), and (9) we have

$$|m(b_h, h)|\, k(b_h, h)\, p^h q^{b_h} = \Big| \sum\nolimits_1 m(\alpha)\, k(\alpha)\, p^y q^x \Big| \le \sum\nolimits_2 k(\alpha)\, p^{h+1} q^x + b_0 b_1 \cdots b_h\, p^{h+1} q^v = K^* p^{h+1}, \tag{10}$$

where:

a) $\sum_1$ denotes summation over all $\alpha \in B$ for which $y > h$.

b) $\sum_2$ denotes summation over all boundary points $\alpha$ of ordinate $h + 1$ and
abscissa $< v$.

c) $K^*$ denotes a constant.

From this it easily follows that $m(b_h, h) = 0$, in contradiction to the definition
of $h$. This proves Lemma 4.

Proof of Theorem 3: Let $(w, h)$ be as defined in the statement of Lemma 4.
From Lemma 4 it follows that, if any other boundary points with abscissa $w$
exist, they must be members of the sequence $(b_0, 0), (b_1, 1), (b_2, 2), \cdots$ and
hence their ordinates are $< h$. From the definition of $(w, h)$ and from Lemma 4
it follows that for any $\alpha \in B$ whose abscissa is $< w$, $m(\alpha) = 0$.

Now in the proofs of Lemmas 1-4 the roles of $x$ and $y$ are not symmetrical.
However, symmetry of course exists, and analogous lemmas follow. In
particular, the analogue to Lemma 4 has as a consequence that, since $w$ is the
smallest abscissa such that $m(\alpha) = 0$ when abscissa of $\alpha < w$, and $m(w, h) \ne 0$,
there exists a boundary point $(w, h')$, such that $m(w, h') \ne 0$ and $(w, h')$ is a
member of $(b_0, 0), (b_1, 1), (b_2, 2), \cdots$. Then $h' < h$. But this contradicts
the definition of $h$ and proves the theorem.

It is easy to see that, if the boundary points of a closed region constitute
a single "curve" instead of two "curves" as in a doubly simple region, the
estimate $\hat{p}(\alpha)$ will be the only proper unbiased estimate of $p$.

It is interesting to consider some of the consequences of Theorem 3 for all
unbiased estimates (not necessarily proper) for doubly simple regions. An
examination of the proof of Theorem 3 shows that it would go through with
little change if equation (3) were replaced by the requirement that $|m(\alpha)|$
be bounded. We therefore obtain the following result: If for a doubly simple
region there exists an unbiased estimate $p(\alpha)$ of $p$, not identically equal to $\hat{p}(\alpha)$,
then not only is $p(\alpha)$ not proper, but also, no matter how large $M$, there exists a
boundary point $\alpha$ such that $|p(\alpha)| > M$. The uselessness of such an estimate
is manifest.

The author is of the opinion that freedom from bias is not necessarily an in- 
dispensable characteristic of an optimum estimate. In general there is no 
reason for requiring the first moment of the estimate rather than any other 
moment to be the unknown parameter. The justification in any particular 
case must be based on special conditions of the problem. 

The author is indebted to Mr. Howard Levene for reading the present paper
and making valuable suggestions.


DIFFERENTIATION UNDER THE EXPECTATION SIGN IN THE
FUNDAMENTAL IDENTITY OF SEQUENTIAL ANALYSIS

By Abraham Wald 
Columbia University 

1. Introduction. Let $\{z_\alpha\}$ $(\alpha = 1, 2, \cdots,$ ad inf.) be a sequence of random
variables which are independently distributed with identical distributions.
Let $a$ be a positive, and $b$ a negative constant. For each positive integral value
$m$, let $Z_m$ denote the sum $z_1 + \cdots + z_m$. Denote by $n$ the smallest integral
value for which $Z_n$ does not lie in the open interval $(b, a)$. For any random
variable $u$, let the symbol $E(u)$ denote the expected value of $u$. The following
identity, which plays a fundamental role in sequential analysis, has been proved
in [1].

$$E[e^{Z_n t} \varphi(t)^{-n}] = 1, \tag{1.1}$$

where

$$\varphi(t) = E(e^{zt}) \tag{1.2}$$

and the distribution of $z$ is equal to the common distribution of $z_1, z_2, \cdots$, etc.
Identity (1.1) holds for all points $t$ in the complex plane for which $\varphi(t)$ exists
and $|\varphi(t)| \ge 1$.

The purpose of this paper is to formulate conditions under which we may
differentiate (1.1) with respect to $t$ under the expectation sign. This is of
interest, since various results in sequential analysis can easily be established
by differentiating (1.1) under the expectation sign. For example, the formula
for $E(n)$ can immediately be obtained by differentiating (1.1) at $t = 0$. The
derivative of $e^{Z_n t} \varphi(t)^{-n}$ at $t = 0$ is given by

$$Z_n - n \varphi'(0) = Z_n - E(z)\, n \tag{1.3}$$

where $\varphi'(t)$ denotes the derivative of $\varphi(t)$. Hence, if we may differentiate (1.1)
under the expectation sign, we obtain the basic formula

$$E(Z_n) = E(z) E(n). \tag{1.4}$$

If $E(z) \ne 0$, the above equation has been used [2] to derive lower and upper
limits for $E(n)$. If, however, $E(z) = 0$, formula (1.4) is of little value. It will
be shown in section 3 that

$$E(n) = \frac{E(Z_n^2)}{E(z^2)} \quad \text{when } E(z) = 0. \tag{1.5}$$

This result is obtained, as will be seen in section 3, by differentiating identity
(1.1) twice at $t = 0$.


2. A sufficient condition for the differentiability of (1.1) under the expectation
sign. In what follows, the parameter $t$ in (1.1) will be restricted to real values,
even if this is not stated explicitly. For any random variable $u$ and any relation
$R$, the symbol $E(u \mid R)$ will denote the conditional expected value of $u$ under
the restriction that $R$ holds. In this section we shall establish the following
theorem.

Theorem 2.1. If $\varphi(t)$ exists for all real values $t$, identity (1.1) may be
differentiated under the expectation sign any number of times with respect to $t$ at any value
$t$ in the domain $\varphi(t) \ge 1$.

Proof: First we shall derive an upper bound for $E(e^{t Z_n} \mid n = m)$ for any
given integral value $m$. Consider the case when $t > 0$. Then

$$E(e^{t Z_n} \mid n = m) \le E(e^{t Z_n} \mid Z_n \ge a,\ n = m) \qquad (t > 0). \tag{2.1}$$

Clearly,

$$E(e^{t Z_n} \mid Z_n \ge a,\ n = m,\ e^{t Z_{m-1}} = \rho e^{at}) = e^{at} \rho\, E\Big(e^{tz} \,\Big|\, e^{tz} \ge \frac{1}{\rho}\Big). \tag{2.2}$$


Let $l(t)$ denote the least upper bound of the expression

$$\rho\, E\Big(e^{tz} \,\Big|\, e^{tz} \ge \frac{1}{\rho}\Big) \tag{2.3}$$

with respect to $\rho$ over the interval $(e^{(b-a)t}, 1)$. The existence of $\varphi(t)$ implies
that $l(t)$ is finite. It follows from (2.1) and (2.2) that

$$E(e^{t Z_n} \mid n = m) \le e^{at}\, l(t) \qquad (t > 0) \tag{2.4}$$

and, therefore, also

$$E(e^{t Z_n}) \le e^{at}\, l(t) \qquad (t > 0). \tag{2.5}$$

If $t < 0$, one can show in a similar way that

$$E(e^{t Z_n} \mid n = m) \le e^{bt}\, l(t) \qquad (t < 0) \tag{2.6}$$

and

$$E(e^{t Z_n}) \le e^{bt}\, l(t) \qquad (t < 0). \tag{2.7}$$

To prove Theorem 2.1, it is sufficient to show that the following two
propositions hold.$^1$

Proposition 2.1. All derivatives of $e^{Z_n t} \varphi(t)^{-n}$ with respect to $t$ exist in the
domain $\varphi(t) \ge 1$.

Proposition 2.2. For any positive integral value $r$ and for any finite interval $I$
in which $\varphi(t) \ge 1$, it is possible to find a function $D(Z_n, n)$ such that

$$D(Z_n, n) \ge \Big| \frac{d^r}{dt^r}\, e^{Z_n t} \varphi(t)^{-n} \Big| \tag{2.8}$$

for all values $t$ in $I$ and

$$E[D(Z_n, n)] < \infty. \tag{2.9}$$

Proposition 2.1 is clearly true, if all derivatives of $\varphi(t)$ exist. The existence
of these derivatives follows from the existence of $\varphi(t)$ for all values $t$.

Since $\frac{d^r}{dt^r}\, e^{Z_n t} \varphi(t)^{-n}$ is equal to the sum of a finite number of terms of the type
$Z_n^{r_1} n^{r_2} e^{Z_n t} \varphi(t)^{-n}$, Proposition 2.2 is proved if we can show that for any given
integral values $r_1$ and $r_2$ there exists a function $D_{r_1 r_2}(Z_n, n)$ such that

$$D_{r_1 r_2}(Z_n, n) \ge | Z_n^{r_1} n^{r_2} e^{Z_n t} \varphi(t)^{-n} | \tag{2.10}$$

for all $t$ in $I$ and

$$E[D_{r_1 r_2}(Z_n, n)] < \infty. \tag{2.11}$$

Clearly, since $\varphi(t) \ge 1$ in $I$,

$$| Z_n^{r_1} n^{r_2} e^{Z_n t} \varphi(t)^{-n} | \le | Z_n^{r_1} |\, n^{r_2} e^{|Z_n| t_0} \tag{2.12}$$

where $t_0$ is an upper bound of $|t|$ in $I$. Let $t_1$ be a value $> t_0$. Then for a
properly chosen constant $C$ we have

$$| Z_n^{r_1} |\, e^{|Z_n| t_0} \le C e^{|Z_n| t_1}. \tag{2.13}$$


$^1$ See, for example, E. J. McShane, Integration, Princeton University Press (1944), pp.
216, 217 and 276.

Hence, it follows from (2.12) and (2.13) that

$$| Z_n^{r_1} n^{r_2} e^{Z_n t} \varphi(t)^{-n} | \le C n^{r_2} e^{|Z_n| t_1} \le C n^{r_2} (e^{Z_n t_1} + e^{-Z_n t_1}) \tag{2.14}$$

for all $t$ in $I$.

We put

$$D_{r_1 r_2}(Z_n, n) = C n^{r_2} (e^{Z_n t_1} + e^{-Z_n t_1}). \tag{2.15}$$

We have

$$E[D_{r_1 r_2}(Z_n, n)] = C \sum_m p_m m^{r_2} [E(e^{Z_n t_1} \mid n = m) + E(e^{-Z_n t_1} \mid n = m)] \tag{2.16}$$

where $p_m$ denotes the probability that $n = m$.

Hence, because of (2.4) and (2.6), we obtain

$$E[D_{r_1 r_2}(Z_n, n)] \le C [e^{a t_1} l(t_1) + e^{-b t_1} l(-t_1)] \sum_m p_m m^{r_2} = C [e^{a t_1} l(t_1) + e^{-b t_1} l(-t_1)]\, E(n^{r_2}). \tag{2.17}$$

Since all moments of $n$ are finite,$^2$ Proposition 2.2 is proved. This completes
the proof of Theorem 2.1.


3. The expected value of n when E(z) = 0. It will be shown in this section
that

$$E(n) = \frac{E(Z_n^2)}{E(z^2)} \quad \text{when } E(z) = 0, \tag{3.1}$$

if identity (1.1) can be differentiated twice under the expectation sign at $t = 0$.
The second derivative of $e^{Z_n t} \varphi(t)^{-n}$ with respect to $t$ is given by

$$\left\{ \left[ Z_n - n \frac{\varphi'(t)}{\varphi(t)} \right]^2 - n\, \frac{\varphi''(t) \varphi(t) - [\varphi'(t)]^2}{[\varphi(t)]^2} \right\} e^{Z_n t} \varphi(t)^{-n} \tag{3.2}$$

where $\varphi'(t)$ denotes the first, and $\varphi''(t)$ the second derivative of $\varphi(t)$.
Since $\varphi(0) = 1$, $\varphi'(0) = E(z) = 0$ and $\varphi''(0) = E(z^2)$, putting $t = 0$, expression
(3.2) becomes

$$Z_n^2 - n \varphi''(0) = Z_n^2 - n E(z^2). \tag{3.3}$$

Hence, if (1.1) may be differentiated twice under the expectation sign at $t = 0$,
we obtain

$$E[Z_n^2 - n E(z^2)] = 0 \tag{3.4}$$

from which (3.1) follows.

An approximate value of $E(n)$ can be obtained from (3.1) by neglecting the
excess of $Z_n$ over the boundaries. Then $Z_n$ can take only the values $a$ and
$b$. Hence

$$E(Z_n^2) \sim a^2 P(Z_n \ge a) + b^2 P(Z_n \le b) \tag{3.5}$$
where the sign $\sim$ denotes approximate equality.

$^2$ See the paper by C. Stein, "A note on cumulative sums," in this issue of the Annals
of Mathematical Statistics.

It was shown in [1] (equation 28) that neglecting the excess of $Z_n$ over the
boundaries, the approximation formula

$$P(Z_n \ge a) \sim \frac{1 - e^{bh}}{e^{ah} - e^{bh}} \tag{3.6}$$

holds, where $h$ is the non-zero root of the equation $\varphi(t) = 1$. This formula was
derived there under the assumption that $E(z) \ne 0$. If $E(z)$ approaches zero,
$h \to 0$ and the right hand member of (3.6) converges to $\dfrac{-b}{a - b}$.

Putting $P(Z_n \ge a) = \dfrac{-b}{a - b}$ and $P(Z_n \le b) = 1 - \dfrac{-b}{a - b} = \dfrac{a}{a - b}$, we
obtain from (3.5)

$$E(Z_n^2) \sim a^2 \Big( \frac{-b}{a - b} \Big) + b^2\, \frac{a}{a - b} = -ab. \tag{3.7}$$

Hence

$$E(n) \sim \frac{-ab}{E(z^2)}. \tag{3.8}$$
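A case in which there is no excess over the boundaries gives a simple check of (3.8): if z takes the values +1 and -1 with probability 1/2 each and a, b are integers, then E(z) = 0, E(z²) = 1, and (3.8) should hold exactly, i.e. E(n) = -ab. The sketch below (function name hypothetical, example chosen for illustration and not taken from the paper) computes E(n) exactly by solving the first-step equations $e_k = 1 + \tfrac{1}{2}(e_{k-1} + e_{k+1})$ with $e_a = e_b = 0$.

```python
from fractions import Fraction


def expected_exit_time(a, b):
    """Exact E(n) for steps +1/-1 (probability 1/2 each), started at 0 and
    stopped when the sum first reaches a > 0 or b < 0 (a, b integers)."""
    states = list(range(b + 1, a))        # positions strictly inside (b, a)
    m = len(states)
    half = Fraction(1, 2)
    # Tridiagonal system: e_k - (e_{k-1} + e_{k+1}) / 2 = 1, e_a = e_b = 0.
    diag = [Fraction(1)] * m
    rhs = [Fraction(1)] * m
    # Forward elimination; sub- and super-diagonal entries are both -1/2.
    for i in range(1, m):
        factor = -half / diag[i - 1]
        diag[i] -= factor * (-half)
        rhs[i] -= factor * rhs[i - 1]
    # Back substitution.
    e = [Fraction(0)] * m
    e[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        e[i] = (rhs[i] + half * e[i + 1]) / diag[i]
    return e[states.index(0)]
```

For a = 3, b = -2 the value -ab is 6, which is what the exact solution should return; for continuous step distributions the excess over the boundaries makes (3.8) only approximate.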

Limits for $E(n)$ can be obtained by deriving limits for $E(Z_n^2)$. Let $r$ be a
non-negative real variable. One can verify that

$$a^2 \le E(Z_n^2 \mid Z_n \ge a) \le \underset{0 < r < a - b}{\text{l.u.b.}}\ E[(a - r + z)^2 \mid z \ge r] \tag{3.9}$$

and

$$b^2 \le E(Z_n^2 \mid Z_n \le b) \le \underset{0 < r < a - b}{\text{l.u.b.}}\ E[(b + r + z)^2 \mid z + r \le 0]. \tag{3.10}$$

We have

$$E(Z_n^2) = P(Z_n \ge a) E(Z_n^2 \mid Z_n \ge a) + P(Z_n \le b) E(Z_n^2 \mid Z_n \le b). \tag{3.11}$$

Limits for $E(Z_n^2)$ can be obtained by replacing the conditional expected
values in the right hand member of (3.11) by their limits given in (3.9) and
(3.10).


A NOTE ON CUMULATIVE SUMS

By Charles Stein 


Columbia University 


Let $\{Z_i\}$ be a denumerable sequence of identical independent real-valued
random variables. Two constants $a > 0 > b$ are chosen and the random variable
$n$ defined as the smallest integer for which one of the inequalities
$\sum_{i=1}^{n} Z_i \ge a$, $\sum_{i=1}^{n} Z_i \le b$ holds. For any events $E_1$ and $E_2$, $P\{E_1\}$ will denote the
probability of the event $E_1$ and $P\{E_1 \mid E_2\}$ the conditional probability of the event
$E_1$ given that $E_2$ has occurred.

It will be shown that there exists $t_0 > 0$ such that the moment generating
function $E e^{nt}$ exists for any complex number $t$ whose real part is less than or
equal to $t_0$, and as an immediate consequence that $n$ has finite moments of all
orders.

If $d$ is any constant satisfying $b < d < a$, then, for fixed $m$,

$$P\Big\{ b < d + \sum_{i=1}^{m} Z_i < a \Big\} \le P\Big\{ \Big| \sum_{i=1}^{m} Z_i \Big| < c \Big\} \tag{1}$$

where $c = |a| + |b|$. We exclude the case $P\{Z_i = 0\} = 1$. Then there
exists $\epsilon > 0$ such that either

$$\delta_1 = P\{Z_i > \epsilon\} > 0 \quad \text{or} \quad \delta_2 = P\{Z_i < -\epsilon\} > 0.$$

Taking, for example, the former alternative with $m_1 = [c/\epsilon] + 1$,

$$P\Big\{ \sum_{i=1}^{m_1} Z_i \ge c \Big\} \ge \delta_1^{m_1} > 0 \tag{2}$$

where $[w]$ denotes the largest integer less than or equal to $w$. For any positive
integer $k$,

$$\frac{P\{n > k m_1\}}{P\{n > (k-1) m_1\}} = P\{ n > k m_1 \mid n > (k-1) m_1 \} \le P\Big\{ b < \sum_{i=1}^{k m_1} Z_i < a \,\Big|\, b < \sum_{i=1}^{s} Z_i < a \text{ for } s = 1, \cdots, (k-1) m_1 \Big\}$$

since $n > k m_1$ implies $b < \sum_{i=1}^{k m_1} Z_i < a$.

But $\sum_{i=1}^{k m_1} Z_i = \sum_{i=1}^{(k-1) m_1} Z_i + \sum_{i=(k-1) m_1 + 1}^{k m_1} Z_i$ and the second sum on the right hand
side is independent of all terms in the first sum.
Thus the distribution of $\sum_{i=1}^{k m_1} Z_i$ given $\sum_{i=1}^{s} Z_i$ for $s = 1, \cdots, (k-1) m_1$
depends only on $\sum_{i=1}^{(k-1) m_1} Z_i$, so that

$$\frac{P\{n > k m_1\}}{P\{n > (k-1) m_1\}} \le P\Big\{ b < \sum_{i=1}^{(k-1) m_1} Z_i + \sum_{i=(k-1) m_1 + 1}^{k m_1} Z_i < a \,\Big|\, b < \sum_{i=1}^{(k-1) m_1} Z_i < a \Big\} \le P\Big\{ \Big| \sum_{i=(k-1) m_1 + 1}^{k m_1} Z_i \Big| < c \Big\} \le 1 - \delta_1^{m_1}$$

by (1) and (2).

Consequently, by induction on $k$,

$$P\{n > m\} \le P\Big\{ n > \Big[\frac{m}{m_1}\Big] m_1 \Big\} \le (1 - \delta_1^{m_1})^{[m/m_1]}.$$

Let $t_0$ be any positive number less than $-\dfrac{1}{m_1} \log (1 - \delta_1^{m_1})$.

Then

$$E e^{n t_0} = \sum_{m=1}^{\infty} e^{m t_0} P\{n = m\} \le \sum_{k=1}^{\infty} e^{k m_1 t_0} P\{(k-1) m_1 < n \le k m_1\} \le \sum_{k=1}^{\infty} e^{k m_1 t_0} P\{n > (k-1) m_1\} \le \sum_{k=1}^{\infty} e^{k m_1 t_0} (1 - \delta_1^{m_1})^{k-1} = \frac{1}{1 - \delta_1^{m_1}} \sum_{k=1}^{\infty} [e^{m_1 t_0} (1 - \delta_1^{m_1})]^k. \tag{5}$$

But this is a geometric series with decreasing terms, and is consequently
convergent. Thus for any $t$ whose real part $R(t) \le t_0$, the moment generating
function $E e^{nt}$ exists. Since, for all positive $l$, $m^l < e^{m t_0}$ for sufficiently large
$m$, $n$ has finite moments of all orders.
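The geometric tail bound obtained by induction can be illustrated on a case where the exact tail of n is computable. The sketch below is an illustration under assumptions not in the note: steps are +1 or -1 with probability 1/2, a and b are integers, and the choice ε = 1/2 (so that δ₁ = 1/2) is made here purely for the example; all function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def tail_probabilities(a, b, max_m):
    """Exact P{n > m} for m = 0..max_m, steps +1/-1 with probability 1/2."""
    dist = {0: Fraction(1)}            # law of the sum on the event {n > m}
    tails = [Fraction(1)]
    for _ in range(max_m):
        nxt = {}
        for pos, pr in dist.items():
            for step in (-1, 1):
                new = pos + step
                if b < new < a:        # still strictly inside the interval
                    nxt[new] = nxt.get(new, Fraction(0)) + pr / 2
        dist = nxt
        tails.append(sum(dist.values(), Fraction(0)))
    return tails


def geometric_bound(a, b, eps, delta, m):
    """(1 - delta^m1)^[m/m1] with c = |a| + |b| and m1 = [c/eps] + 1."""
    c = abs(a) + abs(b)
    m1 = floor(c / eps) + 1
    return (1 - delta**m1) ** (m // m1)


tails = tail_probabilities(2, -2, 40)
bounds = [geometric_bound(2, -2, Fraction(1, 2), Fraction(1, 2), m)
          for m in range(41)]
```

In this example the bound is very loose (m₁ = 9 and δ₁^{m₁} = 1/512), which is consistent with its role in the argument: only geometric decay is needed, not sharpness.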
1. A Test of Randomness in Two Dimensions. Howard Levene, Columbia

University.

A square of side N is divided into $N^2$ unit cells, and each cell takes on the characteristics
A or B with probabilities $p$ and $q = 1 - p$ respectively, independently of the other cells.
A cell is an "upper left corner" if it is A and the cell above and cell to the left are not A.
Let $V_1$ be the total number of upper left corners and let $V_2$, $V_3$, $V_4$ be the number of
similarly defined upper right, lower right, and lower left corners respectively. Let
$V = (V_1 + V_2 + V_3 + V_4)/4$. It is proved that $V$ is normally distributed in the limit with
$E(V) = p(Nq + p)^2$ and $\sigma^2(V) \sim N^2 p q^2 (4 - 20p + 45p^2 - 27p^3)/4$. The conditional limit
distribution of $V$ when $p$ is estimated from the data, and the limit distribution of a related quadratic
form are also obtained. These statistics are in a sense a generalization of the run statistics
used for testing randomness in one dimension.


2. Asymptotic Distribution of Moments from a System of Linear Stochastic 
Difference Equations. Herman Rubin, Cowles Commission for Research 
in Economics. 


Let $\sum_\tau B_\tau y'_{t-\tau} + \Gamma z'_t = u'_t$ $(t = 1, 2, \cdots)$ be a complete system of linear stochastic
difference equations determining $y_{it}$ (the coordinates of $y_t$), $t > 0$, in terms of $y_{it}$, $t \le 0$,
and $z_{kt}$ (the coordinates of $z_t$), which are assumed to be fixed variates, and the random
variables $u_{it}$ (the coordinates of $u_t$). Such a system is called stable if for every bounded
set of fixed variates, and $E(u'_t u_t)$ uniformly bounded, $E(y'_t y_t)$ is uniformly bounded. This
condition is shown to be equivalent to ^ I ^ l >;r I finite, where y\ ■= BrO*!- r ~ r *!-r) 

+ AiF-v > b the solution of the above difference equation. Let Qt be an infinite 
quadratic form in and z,_ Kl * (t, v = 0, 1, • > •) with coefficients depending only on i, k, 
t, and v. Such a quadratic form is called convergent if the sum of the absolute values of 
the coefficients is finite. It is shown under fairly general conditions that the mean of a
convergent quadratic form is asymptotically normally distributed with variance 0


G> 


3. Conditional Expectation and Unbiased Sequential Estimation. David 

Blackwell, Howard University. 

It is shown that $E[f(x_\alpha) E_\alpha y] = E(fy)$ whenever $E(fy)$ is finite, and that $\sigma^2(E_\alpha y) \le \sigma^2(y)$,
with equality holding only if $E_\alpha y = y$, where $E_\alpha y$ denotes the conditional expectation of $y$
with respect to the family of chance variables $x_\alpha$. These results imply that whenever
there is a sufficient statistic $u$ and an unbiased estimate $t$, not a function of $u$ only, for a
parameter $p$, the function $E_u t$, which is a function of $u$ only, is an unbiased estimate for $p$
with variance smaller than that of $t$. A sequential unbiased estimate for a parameter is
obtained, such that when the sequential test terminates after $i$ observations, the estimate
is a function of a sufficient statistic for the parameter with respect to these observations.
A special case of this estimate is that obtained by Girshick, Mosteller, and Savage (Annals
of Math. Stat., Vol. XVII (1946), pp. 13-23) for the parameter of a binomial distribution.


4. A Discussion of the Ehrenfest Model. Preliminary report. Mark Kac, 
Cornell University. 

A particle moves along a straight line in steps $\Delta$, the duration of each step being $\tau$.
The probabilities that the particle at $k\Delta$ will move to the right or left are $(1/2)(1 - k/R)$
and $(1/2)(1 + k/R)$ respectively. $R$ and $k$ are integers and $|k| \le R$. M. C. Wang and
G. E. Uhlenbeck in their paper On the theory of Brownian motion II (Rev. Mod. Phys., Vol.
17 (1945), pp. 323-342) discuss this random walk problem and state several unsolved problems.
In answer to some of the questions raised the following results are obtained. Let
$(1 - z)^{R-j} (1 + z)^{R+j} = \sum_k C_k^{(j)} z^k$ ($j$ an integer); then the probability $P(n, m \mid s)$ that a particle starting
from $n\Delta$ will come to $m\Delta$ after time $t = s\tau$ is equal to $2^{-2R} (-1)^{R+n} \sum_j (j/R)^s\, C^{(-n)}_{R+j}\, C^{(j)}_{R+m}$,
where the summation is extended over all $j$ such that $|j| \le R$. Also, if $R$ is even the
probability $P'(n, 0 \mid s)$ that the particle starting from $n\Delta$ will come to 0 at $t = s\tau$ for the first
time is calculated. For $n = 0$ this gives a solution of the so-called recurrence time problem
first studied on simpler models by Smoluchowski. Through a limiting process in which
$\tau \to 0$, $\Delta \to 0$, $\Delta^2/2\tau \to D$, $1/R\tau \to \beta$, $n\Delta \to x_0$, $m\Delta \to x$, $s\tau = t$, one is led to fundamental
distributions concerning the velocity of a free Brownian particle. In particular, $P(n, m \mid s)$
approaches the well-known Ornstein-Uhlenbeck distribution.
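The step probabilities stated in this abstract are enough to compute $P(n, m \mid s)$ by direct iteration, and a closed form of the kind described can be compared against it on small cases. The sketch below is a cross-check under an explicit assumption: the closed form is taken to be $2^{-2R}(-1)^{R+n}\sum_{|j| \le R} (j/R)^s\, C^{(-n)}_{R+j}\, C^{(j)}_{R+m}$, with $C^{(j)}_k$ the coefficient of $z^k$ in $(1-z)^{R-j}(1+z)^{R+j}$; this reading of the printed formula is a reconstruction, and all function names are hypothetical.

```python
from fractions import Fraction


def poly_mul(u, v):
    """Product of two polynomials given as coefficient lists (lowest first)."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def coeffs(R, j):
    """Coefficients C_k^{(j)} of (1 - z)^(R - j) (1 + z)^(R + j), k = 0..2R."""
    poly = [1]
    for _ in range(R - j):
        poly = poly_mul(poly, [1, -1])
    for _ in range(R + j):
        poly = poly_mul(poly, [1, 1])
    return poly


def p_formula(R, n, m, s):
    """Assumed closed form for P(n, m | s); see the lead-in for the caveat."""
    total = sum(Fraction(j, R) ** s * coeffs(R, -n)[R + j] * coeffs(R, j)[R + m]
                for j in range(-R, R + 1))
    return Fraction((-1) ** (R + n), 4 ** R) * total


def p_direct(R, n, m, s):
    """P(n, m | s) by iterating the stated step probabilities s times."""
    dist = {n: Fraction(1)}
    for _ in range(s):
        nxt = {}
        for k, pr in dist.items():
            right = Fraction(R - k, 2 * R)   # (1/2)(1 - k/R)
            left = Fraction(R + k, 2 * R)    # (1/2)(1 + k/R)
            if right:
                nxt[k + 1] = nxt.get(k + 1, Fraction(0)) + pr * right
            if left:
                nxt[k - 1] = nxt.get(k - 1, Fraction(0)) + pr * left
        dist = nxt
    return dist.get(m, Fraction(0))
```

Agreement on small R is evidence for the reading of the formula, not a proof of it; exact rational arithmetic is used so that the comparison is not blurred by rounding.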

5. Sampling from Contaminated Distributions. Preliminary report. John W. 

Tukey, Princeton University. 

A contaminated distribution is a nearly normal distribution in which extreme
observations are more frequent than in a normal distribution. By studying the bias and
variability of several measures of dispersion when applied to samples from particular
one-parameter families of contaminated distributions it is shown that (i) for nearly normal
distributions, the mean deviation is often better than the standard deviation; (ii) small
changes in the underlying distribution may increase the sampling variance of the standard
deviation by a factor of three. This suggests that, in a broad class of cases, the mean
deviation is safer than the standard deviation when a single dispersion is estimated from a set
of data. This conclusion need not apply in an analysis of variance situation.

6. On the Class of Functions Defined by the Difference Equation $(x + 1) f(x + 1)$
$= (a + bx) f(x)$. Leo Katz, Wayne University.

The difference equation defines only three discrete functions: the binomial, the Poisson
and the Pascal functions; the first and third have one parameter ($N$) slightly generalized.
It is shown that the Pascal function with this generalization is identical with the
Polya-Eggenberger distribution, which is a very useful form of the Compound Poisson Law and
has been used to explain probability situations involving contagion. Areas for all
functions in the class are given in terms of existing tables of the incomplete $\Gamma$- and $\beta$-functions.
Observed distributions are fitted by two moments. As Carver (Handbook of Mathematical
Statistics) pointed out, the advantages of fitting by difference equations are many; not the
least is the fact that it is unnecessary to discriminate among the various functions in fitting
an observed distribution. The problem of discrimination, posed by Frisch (Metron, Vol.
10) and others, may be resolved in terms of the sampling distribution of variances for the
Poisson function, since the three functions correspond to situations where the variance is
less than, equal to, or greater than the mean, respectively.
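The recursion $(x + 1) f(x + 1) = (a + bx) f(x)$ can be iterated directly, which gives a quick way to see the three cases and the ordering of variance and mean. The sketch below is illustrative: the function names, the truncation length, and the particular parameter values are assumptions made for the example (b < 0 with a = Np/q, b = -p/q gives a binomial; b = 0 a Poisson; 0 < b < 1 a Pascal-type law).

```python
from fractions import Fraction


def katz_pmf(a, b, terms):
    """Probabilities f(0..terms-1) from (x + 1) f(x + 1) = (a + b x) f(x),
    normalized over the first `terms` values."""
    weights = [1]
    for x in range(terms - 1):
        nxt = (a + b * x) * weights[-1] / (x + 1)
        if nxt <= 0:              # recursion terminates (binomial case)
            break
        weights.append(nxt)
    weights += [0] * (terms - len(weights))
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_variance(pmf):
    mean = sum(x * f for x, f in enumerate(pmf))
    second = sum(x * x * f for x, f in enumerate(pmf))
    return mean, second - mean * mean


# Binomial with N = 4, p = 3/10: a = N p / q = 12/7, b = -p / q = -3/7.
binomial = katz_pmf(Fraction(12, 7), Fraction(-3, 7), 10)
poisson = katz_pmf(2.0, 0.0, 80)          # b = 0, mean 2
pascal = katz_pmf(1.5, 0.5, 400)          # b > 0 (truncated numerically)
```

The truncation only matters for the two infinite-support cases, where the neglected tail is negligible for the parameter values used here.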

7. Retention of Decimal Places in Matrix Calculations. Franklin E.
Satterthwaite, Aetna Life Insurance Company. (Read by title)

The accumulation of errors in matrix calculations has been studied by the author and
others for special types of matrices and for special methods of calculation. In the present
paper, error formulae are developed for the standard Doolittle and Waugh-Dwyer Compact
routines. These formulae do not place any restrictions on the matrices involved and do
not require any extra calculations or initial approximations. Simple rules are developed
which give for each step in the calculations the number of decimal places which must
be retained. These rules are efficient in the sense that the retention of fewer places will,
except for good fortune in balancing of errors, lead to results less accurate than those
specified. The rules also assist in choosing that arrangement of the calculations which
will lead to the smallest average number of significant figures which must be retained
for the calculation as a whole.

8. The Efficiency of the Mean Moving Range. Paul G. Hoel, University of
California at Los Angeles. (Read by title)

The statistic $w = \frac{\sqrt{\pi}}{2(n-1)} \sum_{i=1}^{n-1} |x_{i+1} - x_i|$ is studied as an estimate of $\sigma$ for a
normal variable subject to trend effects. It is shown that the efficiency of $w$ compares
favorably with that of the mean square successive difference, $\delta^2$. The proof that $w$, and
also $\delta^2$, is asymptotically normally distributed is made to depend upon a general result
that can be derived from a theorem of S. Bernstein on dependent variables.


9. Some Basic Theorems for Developing Tests of Fit for the Case of the
Non-Parametric Probability Distribution Function. Bradford F. Kimball,
New York State Department of Public Service. (Read by title)

Given a universe with c.d.f. $P[X \le x] = F(x)$. Consider a random sample of $n$ values
$x_i$ which have been ordered so that $x_i \le x_{i+1}$. The successive differences of the true c.d.f.
values at $X = x_i$ are denoted by $u_i$. Thus

$$u_1 = F(x_1)$$
$$u_i = F(x_i) - F(x_{i-1}), \quad 2 \le i \le n$$
$$u_{n+1} = 1 - F(x_n).$$

Theorem 1. The product power moments

$$E(u_i^p u_j^q u_k^w \cdots)$$

for any or all different indices from 1 to n + 1, where the powers are real numbers greater than
minus one, are given by

$$\frac{\Gamma(n + 1)\, \Gamma(p + 1)\, \Gamma(q + 1)\, \Gamma(w + 1) \cdots}{\Gamma(n + 1 + p + q + w + \cdots)}.$$

Corollary. If a range $R(k, m)$ is defined by

$$R(0, m) = F(x_m), \qquad R(n + 1, m) = 1 - F(x_{n+1-m})$$
$$R(k, m) = F(x_{k+m}) - F(x_k)$$

where $k$ and $m$ are positive integers such that $m \le n$ and $k + m \le n$, its probability
distribution is independent of $k$, and hence equal to that of $F(x_m)$.

Theorem 2. Given a test function of $u_i$

$$Y = \sum^{m} u_i^p$$

where $p$ is a real positive number, and the sum is for $m$ indices chosen at random on the range
1 to $n + 1$. Let $\bar{Y}$ and $\sigma^2$ denote the mean and variance of this test function. Establish a
convention for increasing the indices included in the above sum for increasing $m$ as $n$ increases,
such that $[m/(n + 1)]$ = constant, to nearest multiple of $1/(n + 1)$. Then the asymptotic
distribution of $(Y - \bar{Y})/\sigma$ for increasing $n$, subject to the above condition, is the normal
distribution with zero mean and unit variance, except in the trivial case $m = n + 1$, $p = 1$.

10. Confidence Limits for the Fraction of a Normal Population which Lies
between Two Given Limits. Jacob Wolfowitz, Columbia University.
(Read by title)

Let $x_1, \cdots, x_N$ be $N$ independent observations from a normal population with mean $\mu$
and variance $\sigma^2$, both unknown. Let $N\bar{x} = \sum x_i$ and $(N - 1)s^2 = \sum (x_i - \bar{x})^2$ define $\bar{x}$ and $s^2$.
Let $L_1$ and $L_2$ be given constants with $L_1 < L_2$, and let

$$\gamma = \frac{1}{\sqrt{2\pi}\, \sigma} \int_{L_1}^{L_2} \exp\Big\{ -\frac{(y - \mu)^2}{2\sigma^2} \Big\}\, dy.$$

By a lower confidence limit on $\gamma$ with confidence coefficient $\alpha$ is meant a function $D(x_1,
\cdots, x_N)$ such that the probability is $\alpha$ that $D < \gamma$. Since $\bar{x}$ and $s^2$ are sufficient estimates of
$\mu$ and $\sigma^2$ the restriction that $D$ be a function of $\bar{x}$ and $s$ only is imposed. It is assumed that
there exist a) a positive $d$ such that $L_1 + d < \mu < L_2 - d$; b) a positive $C$ such that $\sigma < C$.
From these it follows that there exists a lower bound $G = G(d, C)$ on $\gamma$. Let $\chi^2_{1-\alpha}$ be that
number for which $P(\chi^2 < \chi^2_{1-\alpha}) = 1 - \alpha$, where $\chi^2$ has $N - 1$ degrees of freedom, and let
$w = \sqrt{N - 1}\; s / \chi_{1-\alpha}$. It is shown that if $D$ be defined as follows:

1) if $L_1 < \bar{x} < L_2$,

$$D = (2\pi)^{-1/2} \int_{(L_1 - \bar{x})/w}^{(L_2 - \bar{x})/w} \exp\{-\tfrac{1}{2} y^2\}\, dy$$

2) $D = G$ otherwise, then $|P(D < \gamma) - \alpha|$ approaches zero as $N \to \infty$. Thus $D$ is a large
sample lower confidence limit. The extension to upper and two-sided limits presents
no difficulty.


11. The Consolidated Doolittle Technique. Paul Boschan, Econometric 
Institute (Read by title) 

The quadratic matrix notation is interpreted as a segment in a sequence of matrices
wherein each successor matrix is augmented by a bordering row and column. Extension
theorems based on this idea date back into the last century. The step from the original
concept to one of higher order is also fruitful in discussing inverse matrices, specifically
the inverse of a symmetric matrix. The symmetry of the matrix of normal equations for
a set of multiple regression coefficients is restored by adding the transpose of the column
on the right side of the equations, i.e. the covariances with the dependent variable and
the variance of the dependent variable itself. The inverse of this matrix can be
constructed as a partial sum over a series of matrices. Each individual element of this series
is in itself meaningful. The solution for the set of multiple regression coefficients relating
the k-th variable to the preceding (k - 1) variables is a column matrix. The product of
this matrix with its transpose, expressed in terms of the residual variance, forms the k-th
term in the matrix series. The summation of the first n products yields the inverse matrix.
This characteristic of the inverse can be used to great advantage in the standardization
of elementary computational steps.
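
The series described in this abstract can be sketched numerically. The code below is one reading of the claim, not Boschan's computational layout: for each k it regresses the k-th variable on the preceding k - 1, forms the column (-b, 1, 0, ..., 0) from the regression coefficients b, and adds the outer product of that column with itself divided by the residual variance. The function names and the small elimination solver are assumptions made for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def inverse_by_bordering(a):
    """Inverse of a symmetric positive definite matrix as a sum of n terms."""
    n = len(a)
    inv = [[0.0] * n for _ in range(n)]
    for k in range(n):
        lead = [row[:k] for row in a[:k]]      # moments of the first k variables
        col = [a[i][k] for i in range(k)]      # their covariances with variable k
        b = solve(lead, col) if k else []      # regression coefficients
        resid = a[k][k] - sum(c * bi for c, bi in zip(col, b))  # residual variance
        v = [-bi for bi in b] + [1.0] + [0.0] * (n - k - 1)
        for i in range(n):
            for j in range(n):
                inv[i][j] += v[i] * v[j] / resid   # k-th term of the series
    return inv
```

Stopping the loop early gives the partial sums of the series; each one is the inverse of the corresponding leading block, padded with zeros.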

12. Estimation of Structural Equations through Linear Transformation of 
Regression Coefficients. Theodore W. Anderson and Herman Rubin, 
Cowles Commission for Research in Economics. 

A method is presented for estimating the coefficients of a single structural equation in
a system $B y_t' + \Gamma z_t' = u_t'$ ($t = 1, 2, \ldots, T$), where $B$ and $\Gamma$ are matrices of coefficients, $y_t$
is a row vector of $G$ observed jointly dependent variables, $z_t$ of $K$ observed predetermined
variables and $u_t$ of $G$ random elements. Given the distribution of the random elements,
the equations define the distribution of the $y_t$. Some coordinates of $z_t$ may be coordinates
of $y_{t-1}$, etc. It is assumed that the structural equation to be estimated has at least $G - 1$
coefficients prescribed zero. The part of the population regression matrix corresponding to
the predetermined variables with zero coefficients has rank one less than the number of
jointly dependent variables with non-zero coefficients. The maximum likelihood estimate
of this matrix is a linear transformation of the unrestricted sample regression matrix. The
estimated vector of coefficients of $y_t$ is the vector annihilated by this matrix. The vector of
coefficients of $z_t$ is estimated by means of this vector and the regression matrix. These
estimates are consistent and asymptotically normally distributed. For $z_t$ fixed, small
sample confidence regions are given for the coefficients.



NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

Dr. Armen A. Alchian, who has been discharged from the Army with the rank
of Captain, is now an Assistant Professor in the Economics Dept. at the
University of California at Los Angeles.

Dr. Franz L. Alt is now Assistant Director of Research at the Econometric
Institute.

Colonel Dinsmore Alter is on terminal leave after more than four years’
service in the Transportation Corps of the Army, and has returned to his duties
as Director of the Griffith Observatory in Los Angeles. During these years
Colonel Alter traveled approximately 250,000 miles on the ocean as a
Transport Commander, visiting each continent except the Antarctic.

Dr. Theodore W. Anderson, formerly with the Cowles Commission, is now an
Instructor in the Dept. of Math. Statistics at Columbia University, and plans
to be on a Guggenheim Fellowship beginning in June 1947.

Mr. Herbert Barkan has been appointed to an Instructorship in the Newark
College of Engineering.

Mr. Robert E. Bechhofer, formerly a statistician with The Kellex Corpora- 
tion, is a graduate student at Columbia University this year. 

Mr. Stanley G. Behrends is now Cost Accountant with the California Wire 
Cloth Corporation, in Oakland. 

Messrs. Carl A. Bennett, Jack I. Northam, and Max A. Woodbury have 
all returned from various types of war service to the University of Michigan 
as graduate students in statistics. Mr. Bennett was with the Manhattan En- 
gineering District for over two years, first at the Metallurgical Lab., University 
of Chicago, and then at Oak Ridge, Tenn. Mr. Northam was recently dis- 
charged from the Army with the rank of Lieutenant, having served with the 
Signal Corps for four years in the Pacific area. Mr. Woodbury was discharged 
from the Army with the rank of Captain, having been in the Meteorology serv- 
ice for five years, most of which time was spent in the European theater. 

Mr. Richard Berger has received his discharge from the Navy, and is employed
as a Research Analyst with Dun and Bradstreet.

Dr. Archie Blake, formerly at Aberdeen Proving Ground, is now Senior
Statistician in the Office of the Army Surgeon General.

Dr. Ernest E. Blanche, who had been teaching in one of the European Army 
University Centers, is now Principal Administrative Analyst in the Plans and 
Policy Office of the War Department General Staff, and is also Lecturer at 
American University. 

Mr. Royal F. Bloom has resigned the position which he held for a short time
with the Psychology Dept. of Iowa State College after his release from the
Navy last March, and has returned to the Navy Department as Assistant Head
of the Classification Research Division, Bureau of Naval Personnel.

Mr. Earl K. Bowen has been appointed to an Instructorship in statistics at
Babson Institute of Business Administration.

Mr. Albert H. Bowker is enrolled this year as a graduate student at the
University of North Carolina.

Mr. Charles R. Brearty has joined the Technical Staff of Bell Telephone
Laboratories, Inc.

Mr. Clyde A. Bridger is on leave from his position at the University of Utah,
and is spending the year at the Institute of Statistics in Raleigh, North Carolina.

Mr. Arthur W. Brown, formerly with the Columbia University Division of
War Research, is now with the Standard Oil Company of New Jersey.

Dr. George W. Brown, formerly connected with the RCA Laboratories at
Princeton as Research Engineer, has accepted a position as Research Associate
Professor in the Statistical Laboratory at Iowa State College.

Mr. Richard H. Brown has been appointed to a Lectureship in Mathematics 
at Columbia University. 

Dr. Richard S. Burington, Director of the Evaluation and Analysis Groups of
the Research and Development Division of the Bureau of Ordnance, Navy
Department, has been named Chief Mathematician, Bureau of Ordnance.

Mr. Roy A. Chapman, who has been Silviculturist at the Hitchiti
Experimental Forest, Round Oak, Georgia, is now with the U. S. Forest Service in
Washington, D. C.

Dr. Way Ming Chen has been appointed to an Instructorship in mathematics
at Brown University.

Dr. John M. Clarkson has been promoted to a professorship at North Carolina
State College.

Mr. S. Lee Crump has been promoted to an Assistant Professorship at Iowa 
State College. 

Dr. Joseph F. Daly, formerly an Instructor at Catholic University, and more 
recently a Lieutenant in the Navy Department, is now Statistician with the 
Bureau of the Census. 

Dr. Daniel B. DeLury has been promoted to a professorship in statistics at
Virginia Polytechnic Institute.

Dr. Acheson J. Duncan has been appointed to an associate professorship of
political economy at The Johns Hopkins University.

Dr. Jack W. Dunlap, formerly at Rochester University and more recently a
Lieutenant Commander in the U. S. Navy, is now Director of the Division of
Biomechanics of the Psychological Corporation.

Mr. Francis B. Elmore has been discharged from the Army and is Quality 
Control Engineer at the Union Bag and Paper Company, in Savannah, Ga. 

Mr. Mark W. Eudey has returned from service to his former position with the
Statistical Laboratory at the University of California.

Mr. Charles D. Ferris, formerly at Aberdeen Proving Ground, is now Quality
Control Engineer with the General Electric Company, in Bridgeport, Conn.

Mr. Lester R. Frankel is now a statistician with Dun and Bradstreet.

Mr. John E. Freund has accepted a position as assistant professor of
mathematics at Alfred University.

Dr. Bernard Friedman has been promoted to an assistant professorship at
New York University.

Dr. Milton Friedman has been appointed to an associate professorship in the
Department of Economics at the University of Chicago.

Mr. G. Rupert Gause is now with the Technical Staff of the Bell Telephone
Laboratories.

Professor Edwin L. Godfrey has been appointed Head of the Department of
Mathematics and Astronomy at Defiance College.

Dr. Casper Goffman has been appointed to an assistant professorship in the
Department of Mathematics at the University of Kentucky.

Mr. Harry H. Goode is now a Mathematician in the Office of Research and
Inventions, U. S. Navy.

Mr. Robert D. Gordon is a Teaching Assistant in Mathematics at Indiana
University.

Mr. Bert A. Gottfried has returned from the service and is Research Analyst
with Dun and Bradstreet.

Dr. Clyde H. Graves, formerly at Pennsylvania State College, is now
Operations Branch Chief of the Office of Price Board Management, OPA.

Dr. Joseph A. Greenwood has recently been separated from active duty with
the Navy and is now a statistician in the Bureau of Aeronautics.

Mr. Harris T. Guard has returned to Colorado A. and M. as an Instructor in
the Department of Mathematics.

Dr. Joy P. Guilford has returned to his former position as Professor of
Psychology at the University of Southern California.

Prof. Emil J. Gumbel, formerly with the New School of Social Research,
has been appointed to a Special Lectureship in Statistics at Newark College of
Engineering.

Mr. Lee S. Gunlogson has been discharged from the Navy and is now in the
statistical department of the Lumbermens Mutual Casualty Company, Chicago.

Dr. Paul R. Halmos has been appointed to an assistant professorship in
mathematics at the University of Chicago.

Professor Preston C. Hammer has returned to his former position at Oregon
State College.

Mr. Joseph O. Harrison, Jr. is now employed as a mathematician for the
Harvard University Automatic Sequence Controlled Calculator Project in
Cruft Laboratory.

Mr. Millard Hastay, formerly with the Statistical Research Group at
Columbia University, is now Research Associate at the National Bureau of
Economic Research.

Mr. Bernard Hecht has been promoted from Chief Quality Control Engineer
to Manager of the Quality Control Department of the International Resistance
Company, Philadelphia.

Mr. Joseph L. Hodges, Jr. has been appointed to a teaching assistantship in
mathematics at the University of California.

Dr. Paul G. Hoel has been promoted to an associate professorship in
mathematics at the University of California at Los Angeles.

Mr. Richard A. Hornseth has been appointed to an instructorship in the
Department of Sociology and Anthropology at the University of Wisconsin.

Mr. Harry M. Hughes has been appointed to a teaching assistantship at the
University of California.

Mr. Leonid Hurwicz, formerly with the Cowles Commission, has been
appointed to an associate professorship at Iowa State College.

Mr. Joseph B. Jeming has been separated from service with the Air Forces,
and is now a Financial and Economic Consultant in New York City.

Mr. Paul Johner has been discharged from the Army and is now in the
Industrial Engineering Division of the Aluminum Company of America, New
Kensington, Pa.

Miss Margaret Kampschaefer, who is a statistician in the War Department, is 
now serving in the Supply Division of the Air Force Service Command in Erlan- 
gen, Germany. 

Dr. Leo Katz has been appointed to an assistant professorship at Michigan 
State College. 

Mr. Frederick G. King has been discharged from the Army and is now a 
civilian instructor in the Anti-Aircraft Artillery School at Fort Bliss. 

Dr. Tjalling Koopmans has been appointed Associate Professor of Economics 
at the University of Chicago. 

Mr. Paul J. Kopp has been discharged from the Army and is now with the
Patent Department of the Gulf Oil Corporation, Washington, D. C.

Dr. Carl F. Kossack has accepted a position as mathematician with the
Joint Army-Navy Air Intelligence in the Strategic Vulnerability Branch.

Dr. Waclaw Kozakiewicz has been promoted to an assistant professorship
in mathematics at the University of Saskatchewan.

Professor Rafael Laguardia has returned to Uruguay as Director of the
Instituto de Matematica y Estadistica, Facultad de Ingenieria.

Dr. Charles R. Langmuir, formerly with the Psychological Corporation, is
now Secretary-Treasurer and Lab. Director of the Bennett and Langmuir
Development Corporation, Mamaroneck, N. Y.

Mr. Charles M. Larson has accepted a position as mathematician with the 
Pacific Mutual Life Insurance Company, Los Angeles. 

Miss Lucy A. LaSala, formerly with the research group at Columbia Uni- 
versity, is now teacher of mathematics at East New York Vocational High 
School. 

Dr. Richard A. Leibler is now a Member of the Institute for Advanced Study,
Princeton.

Miss Grace L. Lesser, formerly with the research group at Columbia
University, is now employed as a statistician with the Econometric Institute.

Miss Myra Levine has accepted a position as statistician with the
Socony-Vacuum Oil Company, in New York City.

Dr. Jerome C. R. Li has been appointed to an instructorship at Oregon State
College.

Professor William T. Martin has accepted a professorship in the Department
of Mathematics at Massachusetts Institute of Technology.

Miss Ethelyne L. McBee, formerly with the U. S. Department of Agriculture,
is now teaching science and mathematics at the Falls Church High School,
Falls Church, Virginia.

Dr. Paul W. McGann has been appointed to an assistant professorship in
economics at American University.

Dr. Max F. Millikan has been appointed to a research associateship at Yale
University.

Mr. Probodh C. Mittra has accepted a position as consulting statistician with
the United Nations Economic and Social Council.

Dr. Marjorie E. Moore has transferred from her position as statistician with
the Social Security Administration, to one as Program Analyst in the Office of
Vocational Rehabilitation, Federal Security Agency.

Miss Judith Moss, who was with the research group at Columbia University,
is now research assistant with the National Bureau of Economic Research.

Dr. Frederick Mosteller has been appointed to a lectureship and research
associateship in the Department of Social Relations at Harvard University.

Mr. James E. Myers, formerly with the Naval Research Laboratory at
Anacostia Station, is now with the research group of the Moore School of Electrical
Engineering, University of Pennsylvania.

Mr. Stanley W. Nash is a graduate student this year at the University of
California.

Professor J. Neyman is on leave from his position at the University of
California for the fall semester, and is Visiting Professor of Mathematical Statistics
at Columbia University.

Mr. Russell T. Nichols has been discharged from the Army, and is a graduate
student at the University of Chicago.

Mr. Harold Nisselson has been discharged from the Navy, and is now a
statistician in the Bureau of the Census, where he was formerly employed.

Professor Nilan Norris has been separated from his service with the Army,
with the rank of Major, and has returned to his position in the Department of
Economics at Hunter College.

Dr. Guy H. Orcutt, formerly at Massachusetts Institute of Technology, has
accepted a research position in the Department of Applied Economics,
Cambridge University. This new department is to be modelled somewhat along the
lines of the Cowles Commission at the University of Chicago, and is to be under
the direction of Dr. J. R. N. Stone.

Mr. Warren H. Page has been separated from service with the Army, and is
now a graduate student at Columbia University.

Mr. Nicholas Pastore has been appointed to an instructorship at Union Junior
College, Cranford, New Jersey.

Mr. I. B. Perrott has been demobilized from the British Army with the rank
of Major.

Mr. George W. Petrie, III has accepted a position as Special Engineer with
the Bethlehem Steel Company.

Dr. Harry S. Pollard has been promoted to a professorship at Miami
University.

Dr. G. Baley Price, Professor of Mathematics at the University of Kansas,
has been awarded a Post-Service Guggenheim Fellowship, beginning September
1, 1946.

Mr. Robert J. Randall has been discharged from the Army and is now a
graduate student at Columbia University.

Professor Lowell J. Reed, of the School of Hygiene and Public Health, The 
Johns Hopkins University, has been appointed Vice-President of the University. 

Dr. Francis Regan has been promoted to a professorship at St. Louis Uni- 
versity. 

Mrs. Kathryn B. Rolfe, formerly at the University of California at Berkeley,
has accepted a position as associate in mathematics at the University of
California College of Agriculture, at Davis.

Mr. Frank Saidel is a graduate student in mathematical statistics this year at
Columbia University.

Dr. Leonard J. Savage has been awarded a Special Rockefeller Fellowship,
beginning September 1946.

Professor Henry Scheffé of the University of California at Los Angeles has
been awarded a Guggenheim Fellowship, and is spending the year at the
University of California at Berkeley.

Professor Andrew S. Schultz, Jr. has been separated from service with the
Army and has returned to Cornell University with the rank of associate
professor.

Dr. Saul B. Sells, formerly with the OPA, has accepted a position as Assistant
to the President of the A. B. Frank Company, San Antonio.

Mr. Lawrence W. Shaw is now a statistician with the U. S. Public Health
Service in Bethesda.

Dr. Ronald W. Shephard has been appointed to a lectureship at the
University of California, Berkeley.

Mr. Clifford R. Simms has accepted a position as manager of the Cleveland
office of the B. E. Wyatt Company.

Mr. George B. Simon has been separated from Army service with the rank of
major, and has accepted a civilian position as chief of the Analysis and Research
Unit, Psychological Section, Office of Surgeon, Barksdale Field.

Mr. Herbert Solomon has been appointed to an instructorship at the College 
of the City of New York. 

Mr. Melvin D. Springer has returned to the University of Illinois and has been
appointed to an assistantship.

Mr. Andrew P. Stergion has been discharged from the Army, and is now
Statistical and Quality Control Engineer with the Corning Glass Works.

Mr. Milton S. Stevens has been discharged from the Navy, and has accepted
a position as Director of Special Projects with Time, Inc.

Dr. George J. Stigler has been appointed to a professorship in economics at
Brown University.

Mr. Alexander L. Stott has been discharged from the Navy, and is now a staff
assistant in the Treasury Department of the American Telephone and
Telegraph Company.

Dr. L. V. Toralballa has accepted a teaching position at Fordham University.

Dr. Walter R. Van Voorhis has returned to Fenn College, with the rank of
associate professor.

Mr. Edward H. Van Winkle has been appointed to a professorship of business
statistics at Rensselaer Polytechnic Institute.

Dr. Charles W. Vickery has been appointed to an associate professorship at
Ohio State University.

Mr. David F. Votaw, Jr. has been separated from service with the Navy and
has returned to Princeton University as Research Associate.

Mr. W. Allen Wallis has been appointed to a professorship at the University
of Chicago.

Mr. Ralph E. Wareham is now managing director of the National Photocolor
Corporation.

Dr. Jacob Wolfowitz has been appointed to an associate professorship in
mathematical statistics at Columbia University.

Mr. John F. Wyckoff, formerly at Trinity College, has accepted a position in
the Research Division of the Actuarial Department, Connecticut General Life
Insurance Company, Hartford.

Mr. Earl K. Yost, Jr. has been appointed to a graduate assistantship in 
mathematics at the University of Oregon. 


A conference on applied mathematical statistics was held at Lake Junaluska,
North Carolina, August 4-9, 1946 under the sponsorship of the Institute of
Statistics of the University of North Carolina. The following individuals
attended the conference: C. I. Bliss, W. G. Cochran, Gertrude M. Cox, D. B.
Duncan, C. Eisenhart, R. A. Fisher, Carl F. Kossack, Frederick Mosteller,
H. W. Norton, Paul Peach, Charles F. Roos, Walter A. Shewhart, Frederick
Stephan, Gerhard Tintner, John W. Tukey, S. S. Wilks, C. P. Winsor, and J.
Wolfowitz.


Newark College of Engineering is sponsoring a series of conferences on
Industrial Statistics. The first of these, on Acceptance Sampling, began on
September 27 and ran for eleven four-hour Friday sessions. Among the members
of the Advisory Panel on Industrial Statistics are Institute members S. B.
Littauer, A. I. Peterson, W. A. Shewhart, and S. S. Wilks.

New Members

The following persons have been elected to membership in the Institute:

Anscombe, F. J. Rothamsted Experimental Station, Harpenden, Herts, Eng.

Back, Kurt W., M.A. (California at L. A.) Stat., Surveillance Branch, Ballistic Res. Lab.,
Aberdeen Proving Gd., Md.

Bresnahan, Maurice F. Stat., U. S. Bur. of Labor Statistics, Wash., D. C., Apt. SOB, 1016
N St., N.W., Wash. 1

Chung, Kai-Lai, M.A. (Princeton) Graduate Coll., Princeton Univ., Princeton, N. J.

Clarke, P. C. Asst. Gen. Mgr., Hunter Pressed Steel Co., Lansdale, Pa., Line Lexington,
Pa.

Coon, Helen J., M.A. (Southern Methodist) Ballistic Res. Lab., Aberdeen Proving Gd., Md.

Copp, Warren F., B.S. (Ohio State) Supv., Quality Control Dept., Wheeling Steel Corp.,
Yorkville Works, Yorkville, Ohio

Divatia, Vasishtha V., B.Sc. (Bombay) Student in Math. Stat., Columbia Univ., 784
John Jay Hall, Columbia Univ., N. Y. City

Fanshaw, Hugh L., M.S. (Manitoba) Standards Supv., Canadian Indus. Ltd., General
Chemicals Div., Hamilton, Ont., Can., 1X0 St. Clair Ave.

Fériet, Kampé de, Dr. Sci. (Paris) Professeur à la Faculté des Sci. de l’Université de Lille,
10 rue des Jardins, Lille, France

Fine, Clarence B., B.S.S. (C.C.N.Y.) Economist, OPA, Wash., D. C., 1388 Tuckerman St.,
N.W., Wash. 11

Golub, Abraham, B.A. (Brooklyn) Math., Ballistic Res. Lab., Aberdeen Proving Gd.,
Md., Men's Dorm

Gomberg, William, Ph.D. (Columbia) Dir. of Mgt. Engr. Dept., International Ladies
Garment Workers Union, 1710 Broadway, N. Y., N. Y., 444 Beach 148nd St., Neponsit,
L. I.

Holton, Frederick J., Jr. Asst. to Pres., John Deere & Co., 230 S. Clark St., Chicago, Ill.,
1314 Westview Rd., Highland Park

Hanson, Robert H., M.S. (Iowa) Stat., Bur. of the Census, Wash., D. C., 3148 Wes toner
Dr., Wash. 80

Hardy, Philip H., B.S. (Rice) Quality Engr., General Elec. Co.; on leave, Cpl. US Army,
4000 BU Sq. S-, Wright Field, Dayton, Ohio

Hasty, Willis L., Jr., B.C.S. (Benjamin Franklin) Capt., Signal Corps, 8476 South
Wakefield St., Arlington, Va.

Hess, Ida I., A.B. (Indiana) Stat., Population Div., Bur. of the Census, Wash., D. C.,
1486 Rhode Island Ave., N.W.

Jacobson, Jack J., M.B.A. (Chicago) Stat., Spiegel, Inc., Chicago, Ill., 3668 W. Palmer St.

Janko, Prof. Jaroslav, Technical Univ., Prague, Czechoslovakia, Na bojisli S, Praha II

Karp, Abraham E., M.S. (C.C.N.Y.) Stat., Aberdeen Proving Gd., Md., 85 Aberdeen Ave.

Keefe, David P., B.S. (St. Thomas) Supv., Raw Material Testing, Minn. Mining and Mfg.
Co., St. Paul 6, Minn., 690 Holly Ave., St. Paul

Kellogg, Lester S., M.A. (Northwestern) Chief, Prices and Cost of Living Branch, Bur.
of Labor Statistics, U. S. Dept. of Labor, Wash., D. C., 404 Shady Lane, Falls Church,
Va.

Kindig, Fred E., B.S. (Pennsylvania State) Industrial Math., Westinghouse Elec. Co.,
Braddock Ave., E. Pittsburgh, Pa., S3 Nantucket Dr., R.D. 6, Pleasant Hills,
Pittsburgh 10

REPORT ON THE ITHACA MEETING OF THE INSTITUTE

The Ninth Summer Meeting of the Institute of Mathematical Statistics was
held at Cornell University, Ithaca, New York, on Thursday, August 22, and
Friday, August 23, 1946. The meeting was held in conjunction with the
summer meetings of the American Mathematical Society and the Mathematical 
Association of America. The following 71 members of the Institute attended 
the meeting: 

P. L. Alger, C. B. Allendoerfer, T. W. Anderson, Jr., J. L. Barnes, E. E. Blanche, Paul
Boschan, A. H. Bowker, A. E. Brandt, R. H. Burington, W. G. Cochran, E. P. Coleman,
H. B. Curry, J. H. Curtiss, J. L. Doob, J. Dutka, P. S. Dwyer, B. Epstein, Will Feller,
C. D. Ferris, R. M. Foster, J. E. Freund, M. A. Girshick, A. A. Goodman, Louis Guttman,
W. W. Gutzman, P. R. Halmos, T. E. Harris, Bertha I. Hart, E. H. C. Hildebrandt, P. G.
Hoel, R. H. Hoskins, Harold Hotelling, W. W. Jacobs, T. J. Jaramillo, Evan Johnson, Jr.,
H. L. Jones, Mark Kac, Irving Kaplansky, Leo Katz, Tjalling Koopmans, C. F. Kossack,
M. M. Lavin, Walter Leighton, Jr., Howard Levene, M. S. Macphail, J. W. Mauchly, P. J.
McCarthy, E. C. Molina, Margaret E. Moore, J. E. Morton, L. F. Nanni, P. M. Neurath,
E. G. Olds, G. B. Price, C. J. Rees, Selby Robinson, Herman Rubin, P. J. Rulon, Arthur
Sard, F. E. Satterthwaite, I. E. Segal, G. R. Seth, Andrew Sobczyk, Herbert Solomon, C. M.
Stein, F. F. Stephan, A. P. Stergion, A. W. Tucker, J. W. Tukey, J. L. Ullman, Abraham
Wald, S. S. Wilks.

The first session, a joint session with the American Mathematical Society, 
was held on Thursday morning, and was devoted to contributed papers. Pro- 
fessor W. G. Cochran, President of the Institute, presided. The following 
seven papers were presented: 

1. A Test of Randomness in Two Dimensions.
Mr. Howard Levene, Columbia University.

2. Asymptotic Distribution of Moments from a System of Linear Stochastic Difference
Equations.
Mr. Herman Rubin, Cowles Commission for Research in Economics.

3. Conditional Expectation and Unbiased Sequential Estimation.
Professor David Blackwell, Howard University.

4. A Discussion of the Ehrenfest Model. Preliminary report.
Professor Mark Kac, Cornell University.

5. Sampling from Contaminated Distributions. Preliminary report.
Professor John W. Tukey, Princeton University.

6. On the Class of Functions Defined by the Difference Equation (x + 1) f(x + 1) =
(a + bx) f(x).
Dr. Leo Katz, Wayne University.

7. Retention of Decimal Places in Matrix Calculations.
Dr. Franklin E. Satterthwaite, Aetna Life Insurance Company.

The following four papers were presented by title:

8. The Efficiency of the Mean Moving Range.
Professor Paul G. Hoel, University of California at Los Angeles.

9. Some Basic Theorems for Developing Tests of Fit for the Case of the Non-Parametric
Probability Distribution Function.
Mr. Bradford F. Kimball, N. Y. State Department of Public Service, New York
City.

10. Confidence Limits for the Fraction of a Normal Population Which Lies Between Two
Given Limits.
Professor Jacob Wolfowitz, Columbia University.

11. The Consolidated Doolittle Technique.
Dr. Paul Boschan, The Econometric Institute, Inc.

Abstracts of all these papers appear elsewhere in this issue of the Annals. 

At two o’clock on Thursday afternoon there was a joint session with the Ameri- 
can Mathematical Society which featured the invited address of Professor J. 
L. Doob of the University of Illinois on Probability in Function Space. This 
address was followed by a business meeting of the Institute which featured 
reports by the President, the Secretary-Treasurer, the Editor, and Professor 
Feller, who spoke for the recently created committee on the distribution of the 
Annals in the war areas. 

On Thursday evening there was a joint dinner with the American Mathemati- 
cal Society and the Mathematical Association of America. 

The meeting closed with a session on Friday morning devoted to the topic,
Multivariate Analysis for Non-Experimental Data. Professor Will Feller, of
Cornell University, presided. Professor T. Koopmans, of the Cowles
Commission for Research in Economics, presented a paper entitled Statistical
Inference in Dynamic Economic Models. Dr. T. W. Anderson, Jr. presented a paper
written by himself and Mr. Herman Rubin entitled Estimation of Structural
Equations through Linear Transformation of Regression Coefficients. The
meeting concluded with a discussion of these papers.

P. S. Dwyer, 
Secretary. 



REPORT OF THE PRINCETON MEETING OF THE INSTITUTE 


The twenty-third meeting of the Institute of Mathematical Statistics was
held in Princeton, New Jersey on Friday, November 1, 1946, in connection with
the year-long Celebration of the Bicentennial of Princeton University. The
meeting was devoted entirely to Analysis of Variance. The meeting was
attended by 118 persons including the following 96 members of the Institute:

Adam Abruzzi, Forman S. Acton, R. L. Anderson, T. W. Anderson, Jr., M. S. Bartlett,
Robert Bechhofer, Gilbert W. Beebe, J. H. Bigelow, Archie Blake, C. I. Bliss, A. E. Brandt,
Burton H. Camp, George C. Campbell, A. George Carlton, Kai Lai Chung, W. G. Cochran,
Gertrude Cox, Harold Cramér, S. Lee Crump, J. H. Curtiss, Joseph F. Daly, Besse B. Day,
D. B. DeLury, V. V. Divatia, J. Dutka, Churchill Eisenhart, B. Epstein, H. L. Fanshaw,
Nicholas Fattu, Will Feller, Merrill M. Flood, Bernard Friedman, Hilda Geiringer, H. H.
Goldstine, Joseph A. Greenwood, E. J. Gumbel, Margaret Gurney, L. Guttman, T. E.
Harris, Millard Hastay, Irwin S. Hoffer, C. J. Kirchen, B. F. Kimball, Lila F. Knudsen,
H. S. Konijn, Jack Laderman, J. D. Maddrill, Sophie Marcuse, H. C. Mathisen, J. W.
Mauchly, Margaret Merrell, Elmer B. Mode, Margaret E. Moore, J. E. Morton, Judith
Moss, F. Mosteller, Charles M. Mottley, Ray B. Murphy, P. M. Neurath, Hugo Nilson,
Gottfried E. Noether, Monroe L. Norden, H. W. Norton, C. O. Oakley, P. S. Olmstead,
J. G. Osborne, Ellis R. Ott, C. J. Rees, W. A. Reynolds, A. C. Rosander, David Rosenblatt,
Ernest Rubin, P. J. Rulon, Frank Saidel, Marian M. Sandomire, Walter A. Shewhart,
James G. Smith, Milton Sobel, Herbert Solomon, Mortimer Spiegelman, Charles M. Stein,
G. R. Stibitz, John R. Tomlinson, Marion M. Torrey, John W. Tukey, D. F. Votaw, Jr.,
F. M. Wadley, Alton J. Wadman, A. Wald, Robert M. Walter, Lionel Weiss, Frank
Wilcoxon, S. S. Wilks, C. P. Winsor, J. Wolfowitz, and W. J. Youden.

At the morning session the following program was presented with Professor 
S. S. Wilks of Princeton University as chairman: 

Topic: Mathematical Approaches to the Analysis of Variance

Papers: Two Probability Models for the Analysis of Variance
Professor A. Wald, Columbia University
Applications of Analysis of Variance
Professor M. S. Bartlett, Cambridge University and The University of
North Carolina

Discussion: Professor S. L. Crump, Iowa State College
Dr. J. F. Daly, Bureau of the Census
Professor J. W. Tukey, Princeton University
Professor C. P. Winsor, Johns Hopkins University
Professor J. Wolfowitz, Columbia University

The program for the afternoon session, under the chairmanship of Professor 
Will Feller of Cornell University, was as follows: 

Topic: Multivariate Problems in the Analysis of Variance

Papers: Analysis of Covariance
Professor W. G. Cochran, The University of North Carolina
Vector Methods
Professor J. W. Tukey, Princeton University

Discussion: Professor T. W. Anderson, Columbia University
Professor C. I. Bliss, Yale University
Professor Harold Cramér, The University of Stockholm and Princeton
University
Professor D. B. DeLury, Virginia Polytechnic Institute
Professor P. L. Hsu, The University of North Carolina

The evening session consisted of round table discussion on Unsolved Problems 
of the Analysis of Variance, with Professor Gertrude M. Cox as chairman. 

Members of the Institute and others who attended the meeting were guests 
of the Institute for Advanced Study at tea in Fuld Hall from 4 to 6 P.M. Those 
attending the evening session were guests of Princeton members of the Institute 
for refreshments in Fine Hall from 10 to 11 P.M. 

P. S. Dwyer, 
Secretary. 



MEMBERS OF THE INSTITUTE OF MATHEMATICAL STATISTICS* 


(As of October 1, 19$) 
Allen, Prof. Roy G. D. D.Sc. (London) London School of Econ., Houghton St., Aldwych, 
Allendoerfer, Prof. Carl B. Ph.D. (Princeton) Haverford Coll., Haverford, Pa., 750 
Rugby Rd., Bryn Mawr 

Alt, Franz L. Ph.D. (Vienna) Asst. Dir. of Res., Econometric Inst., 600 Fifth Ave., 
N. Y. 18, N. Y., 271 Fort Washington Ave., N. Y. 82 
Anderson, Paul H. Ph.D. (Illinois) Economist, War Assets Adm., Wash., D. C., 1228 
Anderson, Asso. Prof. Richard L. Ph.D. (Iowa State) Inst. of Stat., N. C. State Coll., 
Arnold, Prof. Herbert E. Ph.D. (Yale) Wesleyan Univ., Middletown, Conn., 167 High 
St. 

Arnold, Asst. Prof. Kenneth J. Ph.D. (Mass. Inst. Tech.) Dept. of Math., Univ. of 
Wis., Madison 6, Wis., 738 E. Johnson St., Madison 3 
Aroian, Leo A. Ph.D. (Michigan) Instr., Hunter Coll., N. Y., N. Y., 21,7 Wadsworth 
Arrow, Kenneth J. M.A. (Columbia) Lydig Fellow, Columbia Univ., N. Y. 27, N. Y., 
* Members were asked to supply fresh information for this Directory. The name is 
followed by highest degree and institution granting it. Then follow the professional and 
business connections of the member, with business address, and finally (in italics) the home 
or mail address. When an address is known to be in error it is followed by (last address). 
Changes in addresses or errors in names, titles, or addresses should be reported to the 
Secretary. 
Barkan, Herbert M.A. (Columbia) Instr., Newark Coll. of Engineering, Newark, N. J., 
Barnes, Prof. John L. Ph.D. (Princeton) Chm., Dept. of Applied Math., Tufts Coll., 
Medford 55, Mass., 16 Ardley Rd., Winchester 
Barr, Prof. Arvil S. Ph.D. (Wisconsin) Univ. of Wis., Madison, Wis. 

Barral-Souto, Prof. Jose Sc.D. (Buenos Aires) Univ. of Buenos Aires, Buenos Aires, 
Argentina, Cordoba 1469 
Omaha 8, Nebr. 

Beckstead, Gordon L. M.S. (Michigan) Consultant, Dietary Labs., San Diego, Calif., 
and Grad. Student, Univ. of Calif., Berkeley, 5718 Huntington, Richmond 